Secret changes to major U.S. health datasets raise alarms

bnew

Veteran
Joined
Nov 1, 2015
Messages
67,203
Reputation
10,392
Daps
181,614

by Eric W. Dolan

July 15, 2025

in Exclusive

[Photo by Gage Skidmore]



A new study in the medical journal The Lancet reports that more than 100 United States government health datasets were altered this spring without any public notice. The investigation shows that nearly half of the files examined underwent wording changes, most of them with nothing recorded in the official change logs. The authors warn that hidden edits of this kind can ripple through public health research and erode confidence in federal data.

To reach these findings, the researchers started by downloading the online catalogues—known as harvest sources—that federal agencies maintain under the 2019 Open Government Data Act. They gathered every entry from the Centers for Disease Control and Prevention, the Department of Health and Human Services, and the Department of Veterans Affairs that showed a modification date between January 20 and March 25, 2025.

After removing duplicates and files that are refreshed at least monthly, the team was left with 232 datasets. For each one, they located an archived copy that pre‑dated the study window, most often through the Internet Archive’s Wayback Machine.
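The archive lookup described above can be sketched in a few lines. This is only an illustration, assuming the Wayback Machine's usual address pattern of a date stamp followed by the original page address; the function name `wayback_url` and the example.gov page are hypothetical, and the article does not say how the researchers actually retrieved their copies.

```python
from datetime import date

def wayback_url(page_url: str, on_or_near: date) -> str:
    """Build a Wayback Machine address for the capture closest to the given date."""
    stamp = on_or_near.strftime("%Y%m%d")  # YYYYMMDD prefix of the archive timestamp
    return f"https://web.archive.org/web/{stamp}/{page_url}"

# Hypothetical catalogue page; the study window opened on January 20, 2025
print(wayback_url("https://example.gov/dataset/veterans-fy2021", date(2025, 1, 19)))
```

The archive serves the nearest capture it holds, which may fall after the requested day, so the timestamp of the page that comes back still has to be checked against the study window.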

They then used the comparison feature in a word‑processing program to highlight every textual difference between the older and newer versions. Only wording was assessed; numeric tables were not rechecked. Finally, the investigators opened the public change log that sits at the bottom of each dataset’s web page to see whether the alteration had been declared.
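The same kind of wording comparison can be approximated in code. What follows is a minimal sketch, not the researchers' method (they used a word processor's compare feature); it assumes the archived and current page text are already available as plain strings, and the function name `wording_changes` and the sample sentences are made up for illustration.

```python
import difflib

def wording_changes(old_text: str, new_text: str) -> list[tuple[str, str]]:
    """Return (old words, new words) for every run of words that differs."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, delete, or insert
            changes.append((" ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes

# Hypothetical sample text, loosely modeled on the kind of edit the article describes
archived = "Veterans using healthcare services by Gender and age group"
current = "Veterans using healthcare services by Sex and age group"
for old, new in wording_changes(archived, current):
    print(f"{old!r} -> {new!r}")  # 'Gender' -> 'Sex'
```

Comparing word by word rather than character by character keeps the output readable when a single term is swapped. Like the study, this only looks at wording and says nothing about the numbers in the tables.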

One example captures how the edits appeared in practice. A file from the Department of Veterans Affairs that tracks the number of veterans using healthcare services in the 2021 fiscal year had sat untouched for more than two years. On March 5, 2025, the column heading “Gender” was replaced with “Sex.” The same swap was made in the dataset’s title and in the short description at the top of the page. The modification date on the site updated to reflect the change, yet the built‑in change log still reads, “No changes have been archived yet.”

Across the full sample, the pattern was strikingly consistent. One hundred fourteen of the 232 datasets—49 percent—contained what the authors judged to be potentially substantive wording changes. Of these, 106 switched the term “gender” to “sex.” Four files replaced the phrase “social determinants of health” with “non‑medical factors,” one exchanged “socio‑economic status” for “socio‑economic characteristics,” and a single clinical trial listing rewrote its title so that “gender diverse” became “include men and women.”

In 89 cases, the revision affected text that defines the data itself, such as column names or category labels. The remaining 25 changes occurred in narrative descriptions or tags that sit above the data table. Only 15 of the 114 altered files, less than one in seven, acknowledged the revision in their official logs.

The timing followed a marked acceleration: four edits occurred in the final days of January, 30 during February, and 82 during the first three and a half weeks of March—suggesting an intensified push as spring approached.

These government datasets form the backbone of countless psychology, sociology, and public health projects. The Behavioral Risk Factor Surveillance System, for instance, supplies yearly survey information on smoking, exercise, diet, and chronic illness across every state. It is routinely mined to study links between health behavior and mental well‑being.

Heart disease and stroke mortality files from the Centers for Disease Control and Prevention help social scientists examine how stress, neighborhood environment, or discrimination align with geographic patterns in illness and death.

Nutrition and physical activity surveys inform work on childhood obesity and its ties to screen time or family structure. Researchers who focus on veteran mental health rely on Department of Veterans Affairs summaries to track service‑connected disability, access to therapy, and suicide risk among former service members.

When variable labels shift from “gender” to “sex” in these resources, studies that compare answers given under the old wording with figures retrieved after the change are no longer aligning like‑with‑like. Even a single undocumented edit can scramble replication attempts, invalidate earlier statistical models, or make it impossible to detect real trends in the underlying population.

The implications stretch beyond statistical concerns. Survey designers distinguish between gender, a social identity, and sex, a biological classification, because the two terms capture related but not identical information. Many transgender and non‑binary respondents, for example, select a gender option that differs from the sex recorded on their birth certificate.

If the government retroactively re‑labels a column without clarifying whether the underlying question also changed, analysts cannot tell whether a fluctuation in the male‑to‑female ratio reflects genuine demographic shifts, a wording tweak, or recoding behind the scenes. Public health officials may then allocate resources on a faulty premise, and medical guidelines that depend on demographic baselines can drift off target.

The authors of the study point to a possible political origin for the edits. They note that the White House issued a directive in January instructing agencies to purge material seen as advancing “gender ideology,” language echoed by several state administrations.

No federal office has publicly confirmed that the dataset edits were carried out in response, yet the timing and the tight focus on the term “gender” hint at coordinated action. If the goal was to bring terminology across agencies into alignment, the transparency required by the Open Government Data Act appears to have been set aside.

The investigation is not without limits. Because many archives extend back only a few years, the researchers could not examine earlier periods for similar actions. They judged whether a change was routine or substantive by hand, an approach that introduces subjectivity. They also did not examine numerical content, so it remains unknown whether any figures were edited alongside the wording.

In response to the findings, the authors suggest a series of steps that scholars and institutions can take to protect the reliability of public data. Independent groups already mirror many federal datasets on private servers, and individual investigators can save local copies of files they intend to analyze. Routine spot checks against archived versions can help reveal unexpected alterations.
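A spot check of this kind can be as simple as recording a checksum when a file is first saved and comparing it later. The sketch below is one possible way to do that, not a procedure taken from the study; the helper names, the file name, and the sample contents are all hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def changed_since_snapshot(manifest: dict[str, str], name: str, current: bytes) -> bool:
    """True if `name` was recorded earlier and its content no longer matches."""
    recorded = manifest.get(name)
    return recorded is not None and recorded != fingerprint(current)

# Hypothetical dataset name and contents
manifest: dict[str, str] = {}
saved_copy = b"State,Gender,Count\nNH,F,10\n"
manifest["va_utilization_fy2021.csv"] = fingerprint(saved_copy)

later_copy = b"State,Sex,Count\nNH,F,10\n"
print(changed_since_snapshot(manifest, "va_utilization_fy2021.csv", later_copy))  # True
```

A checksum only says that something changed, not what. Keeping the saved copy itself is what makes it possible to see the actual difference afterwards.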

International repositories such as Europe PubMed Central offer alternative hosting for biomedical resources, lowering dependence on any single government. Most important, the researchers argue, is a cultural commitment to full version tracking inside federal agencies—so that every member of the public can see exactly what changed, when it changed, and why.

The study, “Data manipulation within the US Federal Government,” was authored by Janet Freilich and Aaron S. Kesselheim.
 

bnew

Health Data Danger ...


Posted on Sun Aug 3 06:29:08 2025 UTC






1/37
@GeauxGabrielle
What concerns me a little is that I never typed out HIPAA. So whomever created this now viral screenshot that is on 4 different apps now changed it to spell it out. While harmless, it scares me because people can make anything look like anything now





2/37
@amerine1000
Yeah definitely concerning 😅



3/37
@LeahCushmanNH
Okay but where did you stand on covid contract tracing apps?



4/37
@RabbitTrails
Spelling it out reduces the impact. We all know what HIPPA is



5/37
@MedicTrommasher
You can see exactly where they cut it because the spaces between words increase after “partly…”, there’s less space between the 3rd and last paragraphs, and the added words are a slightly different size/sharpness bc there are 4 possible font choices…



6/37
@carolj6
Don’t do it



7/37
@charisse_sky
Idk why but the spelling out of HIPAA makes me very uneasy. A lot of people don’t know that’s what HIPAA stands for and somehow, spelling it out like that lessens the reality and sinister undertone of what they’re trying to do. Idk if it’s just me or what 😭



8/37
@LayeredPraise
No doubt. The Information Age has devolved into the Disinformation Epoch.



9/37
@LoveNJosie
Thanks to not having to disclose when things have been altered. Having no audit trail for digital content of any kind is troubling



10/37
@The_royalmo
Wow! 😨



11/37
@Moms_know_
You surely didn’t I reposted your original version (first screenshot ) and then that’s the updated one (second screenshot) . That’s definitely not ok .





12/37
@midnucas
This has been possible since the dawn of the web. Easier even in the past 15 years where every desktop web browser comes with developer tools that allow you to easily edit your computer’s local copy of any page and then screenshot it.



13/37
@venusianvampire
That is absolutely terrifying. I'm so scared for our future. I'm like so close to being like "sorry if the date stamp doesn't say pre-2024 I don't believe it thank you!!"



14/37
@eiddor
I was wondering why it looked so off.



15/37
@caveatdata
During the pandemic an insurance company sent me a list of names who hadn’t been vaccinated. I’m pro vaxx, and I was horrified they would send me a list. Not all of the names were treated at my non-medical clinic. 🤬



16/37
@SageThatOne
I found it off without seeing this tweet simply bc you didn’t have a blue check and that add on definitely went over the character limit lol. It is insane how exact they got in font sizing and everything.



17/37
@fagmyer
I follow you so i saw the original. Great you went viral but adding or changing anything is dangerous.



18/37
@queenlondy
Wow that’s nuts



19/37
@SabioScientist
That’s scary AF.



20/37
@MahoganyMonro
Whoa you sure the hell didnt!!!!! I remember reading that tweet!!!! Oh hell nawwww



21/37
@StephLorenzo
That’s so bizarre!



22/37
@turtle_taked0wn
Yeah I remember seeing your original tweet! I'd just thought I shortened it to the acronym in my head when the screenshot started going around though



23/37
@wheetz
That is scary



24/37
@Islandgirlpixie
It’s scary because now if someone gets mad over ANY opinion we express, if they know how to do it they could make it *appear* as if we said something absolutely horrific & possibly criminal.



25/37
@MsNeverOnTime
I looked at the screenshot and asked myself did she edit that?!
Because HIPAA was never spelled out in her original tweet.
Thats so scary and I wonder why they even bothered to spell it out



26/37
@kimobra
I saw your original - then all these versions and wondered who edited and reposted it. It feels like there should be a watermark you could add to tweets - which is crazy.



27/37
@nobodyafrn1
And we have no recourse. No guardrails like we had with television. And unfortunately, the folks who run these platforms have all the power and influence



28/37
@crunchbirdcurls
I remember seeing this tweet and you definitely used the abbreviation. This is crazy.



29/37
@can_tdisplay
Yes.
Every day, I doubt what's posted on social media, more and more.



30/37
@MaxxMilly808
Wow - you’re not kidding 🤨



31/37
@JudyCol29411986
Isn't he worried that we'll find out his real height and weight?



32/37
@Gg9PqmJTJ46Sh5h
creepy

[Quoted tweet]
Hi! Epidemiologist here!

Don’t do this.

In the 80s and 90s this was done by Depts. of Health during the HIV/AIDS epidemic. “Somehow” the lists they created found their way into the hands of cops, landlords, & job sites to deny people human rights.

It’s partly WHY HIPAA exists


33/37
@48thAve
You're not wrong.



34/37
@AmySandSand
They changed your tweet. NOT cool no matter what they changed



35/37
@Metehehe
I speak for everyone when I say we all want to know if we work with someone with aids. Sorry



36/37
@SeaSK
GRIDS … usAIDS was a test run for more than just using Immunology as a pop-control vector as mentioned by bc the NAS’s Dr Handler in 1969, a year before the “prolife party” unveiled their moral imperative to depopulate us by Educated Choice, for starters.

[Quoted tweet]
here's a "cattle" mention ... see the "redlines" in pic 4.

If you read the entire report of 1970, you find the GOP argument is this:

If vax were "moral imperative", now that they've been Too Successful, is not Intervention to depopulate equally Moral?

freerepublic.com/focus/news/…




37/37
@MichelleHi56630
How about you discuss how Fauci experimented on black foster children with AIDS “vaccines” AND then talk about how many of them WITHOUT AIDS died as a result ?!? 🦗 it’s an easy google search



