How One UFO Sighting Becomes Many Records
Duplicate handling is essential because one event can appear in multiple forms, articles, files, and databases.
Introduction
Duplicate reports are one of the easiest ways for UFO databases to look more dramatic than they really are. One sighting can become several records when multiple witnesses file separately, a case is copied from a police file into a newspaper, later summarised in a book, re-entered by a researcher, then imported into a newer database. That does not mean the event is fake, but it does mean that “number of reports” is not the same as “number of distinct UFO events”.
This distinction matters because UFO catalogues are often cited by their totals: tens of thousands of NUFORC reports, thousands of Project Blue Book sightings, or large merged datasets built from several public sources. Those totals are useful for finding material, but they can inflate the apparent frequency of events unless records are grouped, cross-checked and counted at the event level rather than the report level. CUFOS’s UFOCAT codebook is unusually explicit about this: records may describe the same event from different sources, and the database includes fields intended to group those records into blocks referring to the same incident. [Center for UFO Studies]
Why duplicates appear
Duplicates arise because UFO reporting is usually decentralised. A witness may submit a web form to NUFORC, speak to a local newspaper, contact MUFON, post photographs online, and later be included in a historical catalogue. Each version may contain slightly different details: a rounded time, a misspelled location, a different shape label, a corrected date, or a second-hand summary. To a database, those variations can make one event look like several.
UFOCAT shows the problem clearly because it was built as a reference catalogue rather than a single clean list of unique incidents. Its guide explains that a typical record reflects one witness or group of witnesses, one event and one source, but in practice witnesses, events and sources are “not always unambiguous”. It also says that records are meant to reflect the source, even when source data are inaccurate, with special codes used to flag suspected or known inaccuracies. [Center for UFO Studies]
That design is valuable for researchers because it preserves provenance. A newspaper account and an investigator’s file may both matter. But it is dangerous if a reader simply exports all rows and treats each row as one separate UFO. UFOCAT’s own structure anticipates this by using a primary record number and related fields so that filtering by the primary number can retrieve all references to the same event. [Center for UFO Studies]
The same risk exists in modern public reporting systems. NUFORC describes its databank as the largest independently collected online set of UFO or UAP sighting reports, and it allows browsing by event date, location, shape and posting date. It also grades reports into tiers, including reports judged explainable by human or natural phenomena. [nuforc.org] Those features help users inspect the data, but they do not by themselves guarantee that every entry is a unique event.
Counting reports versus events
A report is a piece of testimony or evidence. An event is the thing that may have happened in the sky. The two can diverge in both directions: one event may generate many reports, while one report may describe several objects, several episodes, or a vague memory of an event that occurred long before it was submitted.
Project Blue Book illustrates how public totals can be misunderstood. The US Air Force says it investigated 12,618 sightings from 1947 to 1969, of which 701 remained “unidentified”. [U.S. Air Force] That number is often quoted as if it were a clean count of discrete anomalous events. In practice, Blue Book material came from reports, letters, military channels and investigations collected over more than two decades. The official total is historically important, but it still needs careful reading as a count of investigated sightings, not a laboratory-confirmed count of separate extraordinary objects.
A more transparent model is visible in UFOCAT’s grouping practice. The codebook states that database queries can produce a block of records describing the same event, with entries in the block based on different sources; the chronologically earliest or most complete account is treated as the primary record. [Center for UFO Studies] That is the key conceptual move: keep the source records, but count the event once when making claims about frequency.
This distinction also changes how multiple witnesses should be read. Several independent reports of the same object can strengthen a case because they may provide different angles, times or locations. But if a catalogue counts them as several “UFOs”, it inflates the event count. A good database therefore needs two layers: one layer for individual reports and one layer for linked events.
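The two-layer idea can be sketched in a few lines of Python. This is a minimal illustration, not any catalogue's actual schema: the `Report` class, its field names and the four sample rows are all hypothetical, and picking the earliest account as the primary record is only one simple rule a catalogue might use.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Report:
    record_id: str  # unique per source record, never reused
    event_id: str   # shared by records judged to describe the same event
    source: str     # e.g. "web form", "newspaper", "book"
    received: date  # when this account entered the record


# Hypothetical rows: three accounts of one sighting plus one unrelated report.
reports = [
    Report("R1", "E1", "web form", date(2020, 3, 1)),
    Report("R2", "E1", "newspaper", date(2020, 3, 3)),
    Report("R3", "E1", "book", date(2021, 6, 9)),
    Report("R4", "E2", "web form", date(2020, 3, 2)),
]

# Layer 1: every source record is kept and counted as a report.
report_count = len(reports)

# Layer 2: records are grouped under the event they describe.
events = defaultdict(list)
for r in reports:
    events[r.event_id].append(r)
event_count = len(events)

# Assumed rule: the earliest account in each group is the primary record.
primary = {
    event_id: min(group, key=lambda r: r.received).record_id
    for event_id, group in events.items()
}

print(report_count, event_count, primary)
```

Here four stored records collapse to two events, and both numbers remain available: the first for finding sources, the second for claims about frequency.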
How ordinary sky events multiply entries
Inflation is not limited to archival copying. Modern skies contain repeatable, widely visible objects that can generate clusters of reports across a wide area. Starlink satellite trains are a strong example. NUFORC now warns users before filing that a line of lights travelling together on the same course is probably Starlink and asks them not to report such sightings as UFOs. It also points users to Starlink trackers and launch schedules before submission. [nuforc.org]
The problem is not that witnesses are necessarily dishonest. A 2022 study of UFO reporting during the COVID-19 period found that Starlink reports became a substantial fraction of UFO reports beginning in 2019 and increasing rapidly in 2020. After coding and removing Starlink sightings, the authors concluded that the number of sightings in 2020 was not greater than in 2019, which shows how one new visible phenomenon can create an apparent reporting surge. [ResearchGate]
Aviation cases show the same mechanism with higher stakes. A 2024 arXiv case study analysed an August 2022 incident in which five pilots on two commercial flights over the Pacific reported a UAP, with photographs and video. The researchers reconstructed the sighting using Starlink orbital data and aircraft tracking data, arguing that the reports were consistent with a recently launched satellite train. [arXiv] In a database, that single sky event could appear as multiple pilot reports, multiple media items and multiple later analytical records unless it is explicitly linked.
AARO’s public imagery pages also show why counting needs caution. Several entries are individual “reports” with short video clips, some resolved as balloons or birds, others unresolved because available data were insufficient or did not allow precise attribution. [AARO] A row in a repository may therefore represent a clip, a report packet, an unresolved observation, a prosaic object awaiting better attribution, or one angle on a broader incident.
The database design choices that cause overcounting
Inflated counts usually come from a few recurring design choices rather than from one simple error.
Row-level counting. The easiest mistake is to count database rows. This works only if every row is known to represent one unique event. UFOCAT’s own guide shows why that assumption fails: blocks of records can describe the same event from different sources. [Center for UFO Studies]
Merged datasets without source-aware matching. Large modern datasets often combine NUFORC, MUFON, historical catalogues, scraped archives and press accounts. Unless they retain original source identifiers and use event-level matching, the same case can survive several times under slightly different names.
Rounding and uncertain metadata. UFO reports often contain approximate times, vague durations and broad locations. A study of 80,332 UFO sightings from 1906 to 2014 found that more than 41% of reports supposedly occurred at perfect “o’clock” hours, indicating a strong preference for rounded times. [ScienceDirect] Deduplication that demands exact time matches will miss many duplicates; deduplication that is too loose may wrongly merge separate sightings from the same evening.
Shape and description drift. One witness may call an object an orb, another a light, another a formation, and a later summary a triangle. If the database treats shape as the main identifier, related reports can split apart.
Publication-chain duplication. A case may enter a catalogue once from an investigator’s notes, again from a local article, and again from a later book. UFOCAT preserves such chains because the sources themselves are useful, but users must not confuse bibliographic richness with event multiplicity. [Center for UFO Studies]
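The rounding problem described above is easy to demonstrate. The sketch below uses four made-up timestamps for one evening and hypothetical helper names. It measures how many timestamps fall exactly on the hour, then compares how many groups an exact-time match and a window-based match would produce.

```python
from datetime import datetime

# Hypothetical reported times for accounts of one evening's sighting.
reported = [
    datetime(2020, 3, 1, 21, 0),
    datetime(2020, 3, 1, 21, 7),
    datetime(2020, 3, 1, 21, 0),
    datetime(2020, 3, 1, 20, 55),
]


def on_the_hour_share(times):
    """Fraction of timestamps that fall exactly on the hour."""
    if not times:
        return 0.0
    return sum(t.minute == 0 and t.second == 0 for t in times) / len(times)


def exact_groups(times):
    """Number of groups produced by matching on identical timestamps."""
    return len(set(times))


def window_groups(times, window_minutes=15):
    """Number of groups when neighbouring sorted times within the window are chained."""
    groups = 0
    previous = None
    for t in sorted(times):
        gap_too_large = (
            previous is not None
            and (t - previous).total_seconds() > window_minutes * 60
        )
        if previous is None or gap_too_large:
            groups += 1  # start a new group
        previous = t
    return groups


print(on_the_hour_share(reported))  # 0.5
print(exact_groups(reported))       # 3
print(window_groups(reported))      # 1
```

With these sample values, exact matching splits the accounts into three groups, while a 15-minute window chains them into one. The same chaining is what makes an overly wide window risky: it would also absorb a genuinely separate sighting reported a few minutes later.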
Practical deduplication checks
Good deduplication does not mean deleting everything that looks repetitive. It means preserving the evidence trail while assigning records to probable event groups. For UFO databases and catalogues, the most useful checks are practical and conservative.
- Separate source record ID from event ID. Each submission, article, image or official file should keep its own record ID, but related records should also share a higher-level event ID where the match is strong. UFOCAT’s primary record approach is a useful precedent because it keeps source entries while allowing same-event grouping. [Center for UFO Studies]
- Match on time windows, not exact times. Because witnesses round times, a same-event search should use a reasonable window around reported time and duration. A “9:00 pm” report and a “9:07 pm” report from nearby towns may be the same event; two exact “9:00 pm” reports hundreds of miles apart may not be.
- Use geography as a corridor, not a point. Satellites, rocket launches, aircraft and re-entries can be visible across large regions. A same-event cluster may run along a flight path, satellite pass or launch visibility zone rather than around a single postcode.
- Compare narrative signatures. Phrases such as “line of lights”, “same speed”, “disappeared overhead” and “straight course” can be stronger duplicate clues than shape labels. NUFORC’s Starlink warning gives exactly this kind of pattern description. [nuforc.org]
- Check known-object databases. Before treating a cluster as multiple unknowns, compare it with satellite passes, rocket launches, aircraft tracks, meteor showers, astronomical objects, balloons and drones. The Starlink aviation case shows how orbital and ADS-B data can turn several credible witness reports into one identified event. [arXiv]
- Keep uncertainty flags. Some matches should be labelled probable, possible or rejected. Over-merging is as harmful as under-merging: it can erase genuinely separate events that occurred close together.
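Several of these checks can be combined into a conservative pairwise test. The following is a rough sketch under stated assumptions: the thresholds (20 minutes, 50 km, 500 km), the dictionary keys and the three labels are illustrative choices rather than values taken from any catalogue, and a straight-line distance is only a crude stand-in for a true flight-path or satellite-pass corridor.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a spherical Earth (radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def match_label(a, b, window_min=20, near_km=50, corridor_km=500):
    """Label two reports as a 'probable', 'possible' or 'rejected' same-event match.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    gap_min = abs((a["time"] - b["time"]).total_seconds()) / 60
    dist_km = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
    shared_phrases = set(a["phrases"]) & set(b["phrases"])

    # Outside the time window or the widest plausible corridor: not the same event.
    if gap_min > window_min or dist_km > corridor_km:
        return "rejected"
    # Close in both time and place: likely the same event.
    if dist_km <= near_km:
        return "probable"
    # Far apart but within the corridor: require a shared narrative signature.
    return "possible" if shared_phrases else "rejected"


# Hypothetical reports for illustration only.
r1 = {"time": datetime(2020, 3, 1, 21, 0), "lat": 40.0, "lon": -75.0,
      "phrases": {"line of lights"}}
r2 = {"time": datetime(2020, 3, 1, 21, 7), "lat": 40.2, "lon": -75.1,
      "phrases": {"line of lights", "straight course"}}
r3 = {"time": datetime(2020, 3, 1, 21, 0), "lat": 44.0, "lon": -75.0,
      "phrases": {"line of lights"}}
r4 = {"time": datetime(2020, 3, 1, 21, 0), "lat": 30.0, "lon": -95.0,
      "phrases": {"orb"}}

print(match_label(r1, r2))  # nearby, seven minutes apart
print(match_label(r1, r3))  # same time, several hundred km apart, shared phrase
print(match_label(r1, r4))  # same rounded time, far outside the corridor
```

Returning a label instead of merging records outright keeps the decision reviewable, so a "possible" match can be checked against known-object data before any event ID is shared.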
What inflated counts do to UFO research
Inflated counts distort both public discussion and serious analysis. A map of “sightings” may actually show where people live, where people look up, where reporting forms are popular, where satellites are visible, or where a single widely visible event produced many entries. RAND’s 2023 analysis of NUFORC data used 101,151 public reports across 12,783 US census-designated places, but explicitly warned that its analysis should not be read as an endorsement of any individual report or of the overall quality of the underlying data. [RAND Corporation]
The same caution applies to trend claims. If a year shows more reports, the increase may reflect more witnesses, better reporting access, media attention, a new aircraft or satellite phenomenon, or a change in database intake. The ScienceDirect study of 80,332 UFO reports found that new reports were sensitive to media broadcasting and daytime hours, which is a reminder that report volume measures human reporting behaviour as well as sky phenomena. [ScienceDirect]
Inflation can also make unresolved percentages seem more meaningful than they are. AARO reported more than 1,600 UAP reports in its holdings by late 2024 and said hundreds had been resolved to commonplace objects such as balloons, birds, drones, satellites and aircraft. [U.S. Department of War] If multiple records belong to the same ordinary stimulus, failure to group them can make unresolved backlogs look larger, and later resolution of one event may require updating several linked records.
A better way to read UFO database totals
The safest reading is: report counts measure database activity; event counts measure deduplicated incidents; explanation counts measure the subset that has been investigated enough to classify. These are different numbers, and mixing them produces exaggerated conclusions.
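A small worked example makes the difference concrete. The record IDs, event IDs and outcomes below are invented for illustration, and `None` stands for an event that has not yet been investigated enough to classify.

```python
from collections import Counter

# Hypothetical mapping of stored records to the events they describe.
record_event = {
    "R1": "E1", "R2": "E1", "R3": "E1",
    "R4": "E2",
    "R5": "E3", "R6": "E3",
}

# Hypothetical outcome per event; None means not yet classified.
event_outcome = {"E1": "explained", "E2": "unexplained", "E3": None}

# Report count: how many records the database holds.
report_count = len(record_event)

# Event count: how many distinct incidents remain after grouping.
event_count = len(set(record_event.values()))

# Explanation count: events investigated enough to receive any classification.
classified_events = sum(outcome is not None for outcome in event_outcome.values())

# The same outcomes tallied per record and per event give different pictures.
records_per_outcome = Counter(event_outcome[e] for e in record_event.values())
events_per_outcome = Counter(event_outcome.values())

print(report_count, event_count, classified_events)
print(records_per_outcome["explained"], events_per_outcome["explained"])
```

At the record level, half of the entries look explained (3 of 6); at the event level, it is one event in three. Neither figure is wrong, but they answer different questions.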
For a casual reader, the practical question is not “How many UFO reports are there?” but “How many distinct events remain after matching same-time, same-place and same-source-chain records?” For a researcher, the further question is “How many of those events have enough reliable data to support analysis?” GEIPAN’s methodology is useful here because it classifies cases using both residual strangeness and consistency, where consistency reflects the quantity and reliability of submitted and collected data. [cnes-geipan.fr]
The best UFO catalogues therefore do not simply chase bigger totals. They preserve the messy source trail, show how records are linked, distinguish testimony from event, and make it possible to count both ways. A database with 100,000 reports may be less useful than a smaller catalogue that clearly marks duplicates, source chains, uncertainty levels and event groups. In UFO research, the impressive number is rarely the raw count; it is the number that remains after the same sighting has stopped pretending to be many.
Endnotes
- Source: cufos.org
  Title: Center for UFO Studies
  Link: https://cufos.org/PDFs/UFOCAT%20Codebook%202023.pdf
- Source: nuforc.org
  Title: Data Bank | NUFORC
  Link: https://nuforc.org/databank/
- Source: af.mil
  Title: U.S. Air Force
  Link: https://www.af.mil/About-Us/Fact-Sheets/Display/Article/104590/unidentified-flying-objects-and-air-force-project-blue-book/
  Snippet: Unidentified Flying Objects and Air Force Project Blue Book > Air Force > Fact Sheet Display...
- Source: nuforc.org
  Title: File a UFO Report | NUFORC
  Link: https://nuforc.org/report-a-ufo/
- Source: researchgate.net
  Link: https://www.researchgate.net/publication/368458403_Social_factors_and_UFO_reports_was_the_SARS-CoV-2_pandemic_associated_with_an_increase_in_UFO_reporting
- Source: arxiv.org
  Link: https://arxiv.org/abs/2403.08155
- Source: aaro.mil
  Title: Official UAP Imagery
  Link: https://www.aaro.mil/UAP-Cases/Official-UAP-Imagery/
  Snippet: AARO UAP Imagery...
- Source: sciencedirect.com
  Title: On the dynamics of reporting data: A case study of UFO sightings
  Link: https://www.sciencedirect.com/science/article/abs/pii/S0378437122005295
- Source: rand.org
  Link: https://www.rand.org/content/dam/rand/pubs/research_reports/RRA2400/RRA2475-1/RAND_RRA2475-1.pdf
- Source: war.gov
  Title: U.S. Department of War
  Link: https://www.war.gov/News/Transcripts/Transcript/Article/3965734/dr-jon-kosloski-director-aaro-media-roundtable-on-the-fy24-consolidated-annual/
  Snippet: Dr. Jon Kosloski, Director, AARO, Media Roundtable on the FY24 Consolidated Annual Report on UAP > U.S. Department of War > Transcript |...
- Source: cnes-geipan.fr
  Title: Methodology | GEIPAN
  Link: https://www.cnes-geipan.fr/en/node/58788
- Source: researchgate.net
  Link: https://www.researchgate.net/publication/396123768_Factors_complicating_the_identification_and_processing_of_duplicates_in_bibliographic_records_A_theoretical_perspective
- Source: researchgate.net
  Link: https://www.researchgate.net/publication/385588760_Hyperconflation_Recommending_a_Relational_Alternative_to_the_Datacentric_Approach_to_UAP
- Source: researchgate.net
  Link: https://www.researchgate.net/publication/376519968_An_environmental_analysis_of_public_UAP_sightings_and_sky_view_potential
- Source: cnes-geipan.fr
  Link: https://www.cnes-geipan.fr/en/missions-methodes-et-resultats
- Source: cnes-geipan.fr
  Link: https://www.cnes-geipan.fr/en/node/58787
- Source: war.gov
  Title: DOD Examining Unidentified Anomalous Phenomena
  Link: https://www.war.gov/News/News-Stories/Article/Article/3965403/dod-examining-unidentified-anomalous-phenomena/
- Source: nuforc.org
  Link: https://nuforc.org/ndx/?id=event
- Source: nuforc.org
  Link: https://nuforc.org/
- Source: aaro.mil
  Link: https://www.aaro.mil/FAQ/
- Source: aaro.mil
  Link: https://www.aaro.mil/
- Source: aaro.mil
  Link: https://www.aaro.mil/Portals/136/PDFs/AARO_UAP_Program_Report_User_Guide-20231211.pdf?ver=dJtqTlbDr3HqkIVDW8MP4Q%3D%3D
- Source: aaro.mil
  Title: AARO Historical Record Report Vol 1 2024
  Link: https://www.aaro.mil/Portals/136/PDFs/AARO_Historical_Record_Report_Vol_1_2024.pdf
- Source: cufos.org
  Link: https://cufos.org/cufos-publications-databases/ufocat/
- Source: archives.gov
  Title: Project BLUE BOOK
  Link: https://www.archives.gov/research/military/air-force/ufos
- Source: cnes.fr
  Link: https://cnes.fr/en/projects/geipan
- Source: arxiv.org
  Link: https://arxiv.org/html/2411.02401v1
- Source: Wikipedia
  Title: Project Blue Book
  Link: https://en.wikipedia.org/wiki/Project_Blue_Book
- Source: Wikipedia
  Link: https://en.wikipedia.org/wiki/GEIPAN
- Source: britannica.com
  Title: Project Blue Book
  Link: https://www.britannica.com/topic/Project-Blue-Book