How One UFO Sighting Becomes Many Records

Duplicate handling is essential because one event can appear in multiple forms, articles, files, and databases.

Introduction

Duplicate reports are one of the easiest ways for UFO databases to look more dramatic than they really are. One sighting can become several records when multiple witnesses file separately, a case is copied from a police file into a newspaper, later summarised in a book, re-entered by a researcher, then imported into a newer database. That does not mean the event is fake, but it does mean that “number of reports” is not the same as “number of distinct UFO events”.

This distinction matters because UFO catalogues are often cited by their totals: tens of thousands of NUFORC reports, thousands of Project Blue Book sightings, or large merged datasets built from several public sources. Those totals are useful for finding material, but they can inflate the apparent frequency of events unless records are grouped, cross-checked and counted at the event level rather than the report level. CUFOS’s UFOCAT codebook is unusually explicit about this: records may describe the same event from different sources, and the database includes fields intended to group those records into blocks referring to the same incident. [Center for UFO Studies]

Why duplicates appear

Duplicates arise because UFO reporting is usually decentralised. A witness may submit a web form to NUFORC, speak to a local newspaper, contact MUFON, post photographs online, and later be included in a historical catalogue. Each version may contain slightly different details: a rounded time, a misspelled location, a different shape label, a corrected date, or a second-hand summary. To a database, those variations can make one event look like several.

UFOCAT shows the problem clearly because it was built as a reference catalogue rather than a single clean list of unique incidents. Its guide explains that a typical record reflects one witness or group of witnesses, one event and one source, but in practice witnesses, events and sources are “not always unambiguous”. It also says that records are meant to reflect the source, even when source data are inaccurate, with special codes used to flag suspected or known inaccuracies. [Center for UFO Studies]

That design is valuable for researchers because it preserves provenance. A newspaper account and an investigator’s file may both matter. But it is dangerous if a reader simply exports all rows and treats each row as one separate UFO. UFOCAT’s own structure anticipates this by using a primary record number and related fields so that filtering by the primary number can retrieve all references to the same event. [Center for UFO Studies]

The same risk exists in modern public reporting systems. NUFORC describes its databank as the largest independently collected online set of UFO or UAP sighting reports, and it allows browsing by event date, location, shape and posting date. It also grades reports into tiers, including reports judged explainable by human or natural phenomena. [nuforc.org] Those features help users inspect the data, but they do not by themselves guarantee that every entry is a unique event.

Counting reports versus events

A report is a piece of testimony or evidence. An event is the thing that may have happened in the sky. The two can diverge in both directions: one event may generate many reports, while one report may describe several objects, several episodes, or a vague memory of an event that occurred long before it was submitted.

Project Blue Book illustrates how public totals can be misunderstood. The US Air Force says it investigated 12,618 sightings from 1947 to 1969, of which 701 remained “unidentified”. [U.S. Air Force] That number is often quoted as if it were a clean count of discrete anomalous events. In practice, Blue Book material came from reports, letters, military channels and investigations collected over more than two decades. The official total is historically important, but it still needs careful reading as a count of investigated sightings, not a laboratory-confirmed count of separate extraordinary objects.

A more transparent model is visible in UFOCAT’s grouping practice. The codebook states that database queries can produce a block of records describing the same event, with entries in the block based on different sources; the chronologically earliest or most complete account is treated as the primary record. [Center for UFO Studies] That is the key conceptual move: keep the source records, but count the event once when making claims about frequency.

This distinction also changes how multiple witnesses should be read. Several independent reports of the same object can strengthen a case because they may provide different angles, times or locations. But if a catalogue counts them as several “UFOs”, it inflates the event count. A good database therefore needs two layers: one layer for individual reports and one layer for linked events.
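The two-layer idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Report` class, its field names and the sample rows are hypothetical, not UFOCAT's actual schema, and the primary record is chosen by earliest date alone, whereas the codebook also allows the most complete account to be primary.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Report:
    record_id: str      # unique per source record (hypothetical field name)
    event_id: str       # shared by records judged to describe the same event
    source: str         # e.g. "web form", "newspaper", "book"
    received: datetime  # when this account entered the record


def group_by_event(reports):
    """Return {event_id: [reports]} without discarding any source record."""
    groups = defaultdict(list)
    for report in reports:
        groups[report.event_id].append(report)
    return dict(groups)


def primary_record(block):
    """Pick the chronologically earliest account in a block (a simplification)."""
    return min(block, key=lambda report: report.received)


# Invented sample rows: one event recorded three times, another recorded once.
reports = [
    Report("R1", "E1", "web form", datetime(2020, 3, 1, 21, 30)),
    Report("R2", "E1", "newspaper", datetime(2020, 3, 3, 9, 0)),
    Report("R3", "E1", "book", datetime(2021, 6, 1, 0, 0)),
    Report("R4", "E2", "web form", datetime(2020, 3, 2, 22, 0)),
]

events = group_by_event(reports)
report_count = len(reports)  # report layer: every source record
event_count = len(events)    # event layer: linked incidents
```

Here the report layer holds four rows and the event layer holds two groups, and nothing is deleted to get from one to the other.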

How ordinary sky events multiply entries

Inflation is not limited to archival copying. Modern skies contain repeatable, widely visible objects that can generate clusters of reports across a wide area. Starlink satellite trains are a strong example. NUFORC now warns users before filing that a line of lights travelling together on the same course is probably Starlink and asks them not to report such sightings as UFOs. It also points users to Starlink trackers and launch schedules before submission. [nuforc.org]

The problem is not that witnesses are necessarily dishonest. A 2022 study of UFO reporting during the COVID-19 period found that Starlink reports became a substantial fraction of UFO reports beginning in 2019 and increasing rapidly in 2020. After coding and removing Starlink sightings, the authors concluded that the number of sightings in 2020 was not greater than in 2019, which shows how one new visible phenomenon can create an apparent reporting surge. [ResearchGate]

Aviation cases show the same mechanism with higher stakes. A 2024 arXiv case study analysed an August 2022 incident in which five pilots on two commercial flights over the Pacific reported a UAP, with photographs and video. The researchers reconstructed the sighting using Starlink orbital data and aircraft tracking data, arguing that the reports were consistent with a recently launched satellite train. [arXiv] In a database, that single sky event could appear as multiple pilot reports, multiple media items and multiple later analytical records unless it is explicitly linked.

AARO’s public imagery pages also show why counting needs caution. Several entries are individual “reports” with short video clips, some resolved as balloons or birds, others unresolved because available data were insufficient or did not allow precise attribution. [AARO] A row in a repository may therefore represent a clip, a report packet, an unresolved observation, a prosaic object awaiting better attribution, or one angle on a broader incident.

The database design choices that cause overcounting

Inflated counts usually come from a few recurring design choices rather than from one simple error.

Row-level counting. The easiest mistake is to count database rows. This works only if every row is known to represent one unique event. UFOCAT’s own guide shows why that assumption fails: blocks of records can describe the same event from different sources. [Center for UFO Studies]

Merged datasets without source-aware matching. Large modern datasets often combine NUFORC, MUFON, historical catalogues, scraped archives and press accounts. Unless they retain original source identifiers and use event-level matching, the same case can survive several times under slightly different names.

Rounding and uncertain metadata. UFO reports often contain approximate times, vague durations and broad locations. A study of 80,332 UFO sightings from 1906 to 2014 found that more than 41% of reports supposedly occurred at perfect “o’clock” hours, indicating a strong preference for rounded times. [ScienceDirect] Deduplication that demands exact time matches will miss many duplicates; deduplication that is too loose may wrongly merge separate sightings from the same evening.

Shape and description drift. One witness may call an object an orb, another a light, another a formation, and a later summary a triangle. If the database treats shape as the main identifier, related reports can split apart.

Publication-chain duplication. A case may enter a catalogue once from an investigator’s notes, again from a local article, and again from a later book. UFOCAT preserves such chains because the sources themselves are useful, but users must not confuse bibliographic richness with event multiplicity. [Center for UFO Studies]
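The rounded-time point is easy to check on any dataset that stores reported times. The sketch below is illustrative only: the function names and the four sample timestamps are invented, and it shows how the share of on-the-hour reports can be measured and how the number of candidate duplicate pairs changes as the matching window widens.

```python
from datetime import datetime, timedelta
from itertools import combinations


def on_the_hour_share(timestamps):
    """Fraction of reported times whose minute field is exactly zero."""
    if not timestamps:
        return 0.0
    return sum(1 for t in timestamps if t.minute == 0) / len(timestamps)


def pairs_within(timestamps, window):
    """Count report pairs whose reported times differ by at most `window`."""
    return sum(
        1 for a, b in combinations(timestamps, 2) if abs(a - b) <= window
    )


# Invented reported times: two are rounded to the hour.
sample = [
    datetime(2014, 7, 4, 21, 0),
    datetime(2014, 7, 4, 21, 7),
    datetime(2014, 7, 4, 22, 0),
    datetime(2014, 7, 5, 3, 15),
]

share = on_the_hour_share(sample)
exact = pairs_within(sample, timedelta(0))           # exact matches only
narrow = pairs_within(sample, timedelta(minutes=15))
wide = pairs_within(sample, timedelta(hours=2))
```

With these invented times, exact matching finds no candidate pairs, a 15-minute window finds one, and a two-hour window finds three. That is the trade-off in miniature: a strict window misses the 21:00 and 21:07 pair, while a loose one starts linking reports an hour apart.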

Practical deduplication checks

Good deduplication does not mean deleting everything that looks repetitive. It means preserving the evidence trail while assigning records to probable event groups. For UFO databases and catalogues, the most useful checks are practical and conservative.

  1. Separate source record ID from event ID. Each submission, article, image or official file should keep its own record ID, but related records should also share a higher-level event ID where the match is strong. UFOCAT’s primary record approach is a useful precedent because it keeps source entries while allowing same-event grouping. [Center for UFO Studies]
  2. Match on time windows, not exact times. Because witnesses round times, a same-event search should use a reasonable window around reported time and duration. A “9:00 pm” report and a “9:07 pm” report from nearby towns may be the same event; two exact “9:00 pm” reports hundreds of miles apart may not be.
  3. Use geography as a corridor, not a point. Satellites, rocket launches, aircraft and re-entries can be visible across large regions. A same-event cluster may run along a flight path, satellite pass or launch visibility zone rather than around a single postcode.
  4. Compare narrative signatures. Phrases such as “line of lights”, “same speed”, “disappeared overhead” and “straight course” can be stronger duplicate clues than shape labels. NUFORC’s Starlink warning gives exactly this kind of pattern description. [nuforc.org]
  5. Check known-object databases. Before treating a cluster as multiple unknowns, compare it with satellite passes, rocket launches, aircraft tracks, meteor showers, astronomical objects, balloons and drones. The Starlink aviation case shows how orbital and ADS-B data can turn several credible witness reports into one identified event. [arXiv]
  6. Keep uncertainty flags. Some matches should be labelled probable, possible or rejected. Over-merging is as harmful as under-merging: it can erase genuinely separate events that occurred close together.
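Several of the checks above can be combined into a conservative pairwise test. The following is a minimal sketch, not a method taken from any of the catalogues discussed: the `Report` fields, the cue phrases, the 30-minute window and the 150 km threshold are all assumptions chosen for illustration, and point-to-point distance stands in for a real corridor test, which would need an actual flight path or satellite track.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Report:
    record_id: str
    when: datetime
    lat: float
    lon: float
    text: str


# Illustrative narrative cues; a real list would be curated and much longer.
CUES = ("line of lights", "same speed", "straight course", "disappeared overhead")


def km_between(a, b):
    """Great-circle distance in kilometres (haversine formula)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def shared_cues(a, b):
    """Cue phrases that appear in both narratives."""
    text_a, text_b = a.text.lower(), b.text.lower()
    return {cue for cue in CUES if cue in text_a and cue in text_b}


def classify_pair(a, b, window=timedelta(minutes=30), max_km=150.0):
    """Label a pair probable, possible or rejected instead of merging outright."""
    close_in_time = abs(a.when - b.when) <= window
    close_in_space = km_between(a, b) <= max_km
    cues = shared_cues(a, b)
    if close_in_time and close_in_space and cues:
        return "probable"
    if close_in_time and (close_in_space or cues):
        return "possible"
    return "rejected"


# Invented reports for demonstration only.
a = Report("A", datetime(2021, 5, 1, 21, 0), 40.0, -105.0,
           "A line of lights moving at the same speed")
b = Report("B", datetime(2021, 5, 1, 21, 7), 40.2, -105.1,
           "Saw a line of lights on a straight course")
c = Report("C", datetime(2021, 5, 1, 21, 0), 47.0, -122.0,
           "Bright orb hovering")
d = Report("D", datetime(2021, 5, 1, 21, 10), 40.1, -105.0,
           "Single red light")

labels = {
    "A-B": classify_pair(a, b),
    "A-C": classify_pair(a, c),
    "A-D": classify_pair(a, d),
}
```

The pair A and B comes out as probable (near in time and place, with a shared phrase), A and D as possible (near, but with no narrative overlap), and A and C as rejected despite identical reported times, because the locations are far apart and the descriptions share nothing. Returning a label rather than merging keeps the uncertainty visible and leaves the final grouping to a reviewer.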

What inflated counts do to UFO research

Inflated counts distort both public discussion and serious analysis. A map of “sightings” may actually show where people live, where people look up, where reporting forms are popular, where satellites are visible, or where a single widely visible event produced many entries. RAND’s 2023 analysis of NUFORC data used 101,151 public reports across 12,783 US census-designated places, but explicitly warned that its analysis should not be read as an endorsement of any individual report or of the overall quality of the underlying data. [RAND Corporation]

The same caution applies to trend claims. If a year shows more reports, the increase may reflect more witnesses, better reporting access, media attention, a new aircraft or satellite phenomenon, or a change in database intake. The ScienceDirect study of 80,332 UFO reports found that new reports were sensitive to media broadcasting and daytime hours, which is a reminder that report volume measures human reporting behaviour as well as sky phenomena. [ScienceDirect]

Inflation can also make unresolved percentages seem more meaningful than they are. AARO reported more than 1,600 UAP reports in its holdings by late 2024 and said hundreds had been resolved to commonplace objects such as balloons, birds, drones, satellites and aircraft. [U.S. Department of War] If multiple records belong to the same ordinary stimulus, failure to group them can make unresolved backlogs look larger, and later resolution of one event may require updating several linked records.

A better way to read UFO database totals

The safest reading is: report counts measure database activity; event counts measure deduplicated incidents; explanation counts measure the subset that has been investigated enough to classify. These are different numbers, and mixing them produces exaggerated conclusions.
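A small sketch makes the three numbers concrete. The record layout, the status labels and the sample rows below are hypothetical, and the rule that only "explained" and "unexplained" count as classified is an assumption made for the example.

```python
def summarise(records, event_status):
    """Return report, event and classified-event counts as separate numbers."""
    event_ids = {record["event_id"] for record in records}
    classified = sum(
        1
        for event_id in event_ids
        if event_status.get(event_id) in {"explained", "unexplained"}
    )
    return {
        "reports": len(records),          # database activity
        "events": len(event_ids),         # deduplicated incidents
        "classified_events": classified,  # investigated enough to classify
    }


# Invented rows: six records that link to three events.
records = [
    {"record_id": "R1", "event_id": "E1"},
    {"record_id": "R2", "event_id": "E1"},
    {"record_id": "R3", "event_id": "E1"},
    {"record_id": "R4", "event_id": "E2"},
    {"record_id": "R5", "event_id": "E3"},
    {"record_id": "R6", "event_id": "E3"},
]
event_status = {
    "E1": "explained",
    "E2": "insufficient data",
    "E3": "unexplained",
}

summary = summarise(records, event_status)
```

With these rows the summary holds six reports, three events and two classified events. Quoting the six as if it were the three, or the three as if all had been assessed, is exactly the mixing the paragraph above warns against.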

For a casual reader, the practical question is not “How many UFO reports are there?” but “How many distinct events remain after matching same-time, same-place and same-source-chain records?” For a researcher, the further question is “How many of those events have enough reliable data to support analysis?” GEIPAN’s methodology is useful here because it classifies cases using both residual strangeness and consistency, where consistency reflects the quantity and reliability of submitted and collected data. [cnes-geipan.fr]

The best UFO catalogues therefore do not simply chase bigger totals. They preserve the messy source trail, show how records are linked, distinguish testimony from event, and make it possible to count both ways. A database with 100,000 reports may be less useful than a smaller catalogue that clearly marks duplicates, source chains, uncertainty levels and event groups. In UFO research, the impressive number is rarely the raw count; it is the number that remains after the same sighting has stopped pretending to be many.

Endnotes

  1. Source: cufos.org
    Title: Center for UFO Studies
    Link: https://cufos.org/PDFs/UFOCAT%20Codebook%202023.pdf

  2. Source: nuforc.org
    Title: Data Bank | NUFORC
    Link: https://nuforc.org/databank/

  3. Source: af.mil
    Title: U.S. Air Force, Unidentified Flying Objects and Air Force Project Blue Book (fact sheet)
    Link: https://www.af.mil/About-Us/Fact-Sheets/Display/Article/104590/unidentified-flying-objects-and-air-force-project-blue-book/

  4. Source: nuforc.org
    Title: File a UFO Report | NUFORC
    Link: https://nuforc.org/report-a-ufo/

  5. Source: researchgate.net
    Link: https://www.researchgate.net/publication/368458403_Social_factors_and_UFO_reports_was_the_SARS-CoV-2_pandemic_associated_with_an_increase_in_UFO_reporting

  6. Source: arxiv.org
    Link: https://arxiv.org/abs/2403.08155

  7. Source: aaro.mil
    Title: Official UAP Imagery
    Link: https://www.aaro.mil/UAP-Cases/Official-UAP-Imagery/

  8. Source: sciencedirect.com
    Title: ScienceDirect On the dynamics of reporting data: A case study of UFO sightings
    Link: https://www.sciencedirect.com/science/article/abs/pii/S0378437122005295

  9. Source: rand.org
    Link: https://www.rand.org/content/dam/rand/pubs/research_reports/RRA2400/RRA2475-1/RAND_RRA2475-1.pdf

  10. Source: war.gov
    Title: U.S. Department of War, Dr. Jon Kosloski, Director, AARO, Media Roundtable on the FY24 Consolidated Annual Report on UAP (transcript)
    Link: https://www.war.gov/News/Transcripts/Transcript/Article/3965734/dr-jon-kosloski-director-aaro-media-roundtable-on-the-fy24-consolidated-annual/

  11. Source: cnes-geipan.fr
    Title: Methodology | GEIPAN
    Link: https://www.cnes-geipan.fr/en/node/58788

  12. Source: researchgate.net
    Link: https://www.researchgate.net/publication/396123768_Factors_complicating_the_identification_and_processing_of_duplicates_in_bibliographic_records_A_theoretical_perspective

  13. Source: researchgate.net
    Link: https://www.researchgate.net/publication/385588760_Hyperconflation_Recommending_a_Relational_Alternative_to_the_Datacentric_Approach_to_UAP

  14. Source: researchgate.net
    Link: https://www.researchgate.net/publication/376519968_An_environmental_analysis_of_public_UAP_sightings_and_sky_view_potential

  15. Source: cnes-geipan.fr
    Link: https://www.cnes-geipan.fr/en/missions-methodes-et-resultats

  16. Source: cnes-geipan.fr
    Link: https://www.cnes-geipan.fr/en/node/58787

  17. Source: war.gov
    Title: dod examining unidentified anomalous phenomena
    Link: https://www.war.gov/News/News-Stories/Article/Article/3965403/dod-examining-unidentified-anomalous-phenomena/

  18. Source: nuforc.org
    Link: https://nuforc.org/ndx/?id=event

  19. Source: nuforc.org
    Link: https://nuforc.org/

  20. Source: aaro.mil
    Link: https://www.aaro.mil/FAQ/

  21. Source: aaro.mil
    Link: https://www.aaro.mil/

  22. Source: aaro.mil
    Link: https://www.aaro.mil/Portals/136/PDFs/AARO_UAP_Program_Report_User_Guide-20231211.pdf?ver=dJtqTlbDr3HqkIVDW8MP4Q%3D%3D

  23. Source: aaro.mil
    Title: AARO Historical Record Report Vol 1 2024
    Link: https://www.aaro.mil/Portals/136/PDFs/AARO_Historical_Record_Report_Vol_1_2024.pdf

  24. Source: cufos.org
    Link: https://cufos.org/cufos-publications-databases/ufocat/

  25. Source: archives.gov
    Title: Project BLUE BOOK
    Link: https://www.archives.gov/research/military/air-force/ufos

  26. Source: cnes.fr
    Link: https://cnes.fr/en/projects/geipan

  27. Source: arxiv.org
    Link: https://arxiv.org/html/2411.02401v1

  28. Source: Wikipedia
    Title: Project Blue Book
    Link: https://en.wikipedia.org/wiki/Project_Blue_Book

  29. Source: Wikipedia
    Link: https://en.wikipedia.org/wiki/GEIPAN

  30. Source: britannica.com
    Title: Project Blue Book
    Link: https://www.britannica.com/topic/Project-Blue-Book
