Publications
Published Articles (38)
- Improving Results Reporting in RCT Registries. Submitted, 2026
- A Simulated Reconstruction and Reidentification Attack on the 2010 U.S. Census. Harvard Data Science Review, Jul 2025
- TROV - A Model and Vocabulary for Describing Transparent Research Objects. International Journal of Digital Curation, Feb 2025
- Using Containers to Validate Research on Confidential Data at Scale. Harvard Data Science Review, Jun 2025
- Reproduce to validate: A comprehensive study on the reproducibility of economics research. Canadian Journal of Economics/Revue canadienne d’économique, Aug 2024
- A Guide for Social Science Journal Editors on Easing into Open Science. Research Integrity and Peer Review, Feb 2024
- Introduction to the special issue: Models of linked employer–employee data: Twenty years after “High wage workers and high wage firms”. Journal of Econometrics, 2023
- Reproducibility and transparency versus privacy and confidentiality: Reflections from a data editor. Journal of Econometrics, Jun 2023. Published online
- Reinforcing Reproducibility and Replicability: An Introduction. Harvard Data Science Review, Jul 2023
- An Interview with John M. Abowd. International Statistical Review, Feb 2022
- Teaching for large-scale Reproducibility Verification. Journal of Statistics and Data Science Education, Sep 2022
- On privacy in the age of COVID-19. Journal of Privacy and Confidentiality, Feb 2021
- Recalculating ... : How Uncertainty in Local Labour Market Definitions Affects Empirical Findings. Applied Economics, Jan 2021
- metajelo: A metadata package for journals to support external linked objects. International Journal of Digital Curation, 2021
- Applying data synthesis for longitudinal business data across three countries. Statistics in Transition New Series, 2020
- Total Error and Variability Measures for the Quarterly Workforce Indicators and LEHD Origin-Destination Employment Statistics in OnTheMap. Journal of Survey Statistics and Methodology, Nov 2020
- Reproducibility and Replicability in Economics. Harvard Data Science Review, Dec 2020
- Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the US Statistical System? Journal of Survey Statistics and Methodology, 2019. First published December 2018
- Remembering Stephen Fienberg. Journal of Privacy and Confidentiality, Dec 2018
- Relaunching the Journal of Privacy and Confidentiality. Journal of Privacy and Confidentiality, Dec 2018
- Understanding the effect of procedural justice on psychological distress. International Journal of Stress Management, 2017
- Using partially synthetic microdata to protect sensitive cells in business statistics. Statistical Journal of the International Association for Official Statistics, 2016
- Synthetic establishment microdata around the world. Statistical Journal of the International Association for Official Statistics, 2016
- Looking back on three years of using the Synthetic LBD beta. Statistical Journal of the IAOS: Journal of the International Association for Official Statistics, 2014
- A First Step Towards A German SynLBD: Constructing A German Longitudinal Business Database. Statistical Journal of the IAOS: Journal of the International Association for Official Statistics, 2014
- Differential privacy applications to Bayesian and linear mixed model estimation. Journal of Privacy and Confidentiality, 2013
- Data Management of Confidential Data. International Journal of Digital Curation, 2013
- Did the Housing Price Bubble Clobber Local Labor Market Job and Worker Flows When It Burst? The American Economic Review, May 2012
- National estimates of gross employment and job flows from the quarterly workforce indicators with demographic and industry detail. Journal of Econometrics, 2011. Free to read at https://pmc.ncbi.nlm.nih.gov/articles/PMC3079891/
- Science, confidentiality, and the public interest. Chance, 2011
- Procedural justice criteria in salary determination. Journal of Managerial Psychology, 2008. https://doi.org/10.1108/02683940810894765
- The sensitivity of economic statistics to coding errors in personal identifiers. Journal of Business & Economic Statistics, Apr 2005
- Escaping poverty for low-wage workers: The role of employer characteristics and changes. Industrial and Labor Relations Review, Jul 2004. http://www.jstor.org/stable/4126683
- Early career experiences and later career outcomes: Comparing the United States, France and Germany. Vierteljahrshefte zur Wirtschaftsforschung, 2001
- La spécificité de la formation en milieu de travail : un survol des contributions théoriques et empiriques récentes. L’Actualité économique, Revue d’analyse économique, Mar 2001
Working Papers (68)
- Assessing Utility of Differential Privacy for RCTs. arXiv:2309.14581v2, 2026
- Assessing Reproducibility in Economics Using Standardized Crowd-sourced Analysis. National Bureau of Economic Research, Working Paper 33753, May 2025
- Mass Reproducibility and Replicability: A New Hope. I4R Discussion Paper Series, Working Paper 107, 2024
- Crowdsourcing Digital Public Goods: A Field Experiment on Metadata Contributions. Social Science Research Network, SSRN Scholarly Paper 5008203, Nov 2024
- Protecting Confidential Data through Non-Statistical Methods. Cornell University, Document 116054, Oct 2024
- The 2010 Census Confidentiality Protections Failed, Here’s How and Why. arXiv:2312.11283, Dec 2023
- Assessing Utility of Differential Privacy for RCTs. arXiv:2309.14581v1 [cs, econ, stat], Sep 2023
- Reproducibility and Transparency versus Privacy and Confidentiality: Reflections from a Data Editor. arXiv:2305.14478v1 (submitted version), 2023
- Data and Code Availability Standard. Social Science Data Editors, Version 1.0, Dec 2022
- An Interview with John M. Abowd. Cornell University Labor Dynamics Institute Document, Feb 2022
- A template README for social science replication packages. Zenodo v1.1.0, Nov 2022
- Teaching for large-scale Reproducibility Verification. arXiv:2204.01540v1, Mar 2022
- Applying Data Synthesis for Longitudinal Business Data across Three Countries. arXiv:2008.02246, Jul 2020
- Consumer expectations around COVID-19: Evolution over time. Labor Dynamics Institute, Online, 2020
- A template README for social science replication packages. Zenodo v1.0.0, Dec 2020
- Why the Economics Profession Must Actively Participate in the Privacy Protection Debate. Labor Dynamics Institute, Cornell University, Document 51, Jan 2019
- Suboptimal Provision of Privacy and Statistical Accuracy When They are Public Goods. arXiv:1906.09353, Jun 2019
- Cornell Criminal Records Panel Study Questionnaire Wave 2. Cornell Criminal Records Panel Study (CRPS), Document 69333, Mar 2019
- metajelo: A metadata package for journals to support external linked objects. Labor Dynamics Institute, Document, 2019
- Disclosure Limitation and Confidentiality Protection in Linked Data. Labor Dynamics Institute, Cornell University, Document 47, Jan 2018
- Reproducibility and replicability in economics. National Academies of Sciences, Engineering, and Medicine, Commissioned Paper, 2018
- Codebook for the SIPP Synthetic Beta 7.0 (PDF version). Cornell Institute for Social and Economic Research and Labor Dynamics Institute, Cornell University, Codebook V20181102b-pdf, Nov 2018
- Cornell Project for Records Assistance Questionnaire - with routing. May 2017
- Two Perspectives on Commuting: A Comparison of Home to Work Flows Across Job-Linked Survey and Administrative Files. U.S. Census Bureau Center for Economic Studies Discussion Paper, 2017
- Utility Cost of Formal Privacy for Releasing National Employer-Employee Statistics. Labor Dynamics Institute, Cornell University, Document 36, 2017
- Making Confidential Data Part of Reproducible Research. Labor Dynamics Institute, Cornell University, Document 41, 2017
- Proceedings from the 2017 Cornell-Census-NSF-Sloan Workshop on Practical Privacy. Labor Dynamics Institute, Cornell University, Document 43, 2017
- Proceedings from the Synthetic LBD International Seminar. Labor Dynamics Institute, Cornell University, Document 44, 2017
- Proceedings from the 2016 NSF-Sloan Workshop on Practical Privacy. Labor Dynamics Institute, Cornell University, Document 33, 2017
- LEHD Infrastructure files in the Census RDC - Overview. Center for Economic Studies, U.S. Census Bureau 14-26, Jun 2014
- Methods for Protecting the Confidentiality of Firm-Level Data: Issues and Solutions. Labor Dynamics Institute 19, Mar 2013
- Dynamically Consistent Noise Infusion and Partially Synthetic Data as Confidentiality Protection Measures for Related Time Series. U.S. Census Bureau, Center for Economic Studies 12-13, Oct 2012
- New York State disability and employment status report, 2011. Cornell University, Employment and Disability Institute, Report on behalf of New York Makes Work Pay Comprehensive Employment System Medicaid Infrastructure Grant, 2011
- LEHD Infrastructure files in the Census RDC-Overview S2008. U.S. Census Bureau, Working Papers 11-43, 2011
- LEHD Infrastructure Files in the Census RDC: Overview of S2004 Snapshot. U.S. Census Bureau, Working Papers 11-13, Apr 2011
- New York State disability and employment status report, 2009. Cornell University, Employment and Disability Institute, Report on behalf of New York Makes Work Pay Comprehensive Employment System Medicaid Infrastructure Grant, 2010
- Measuring firm-level displacement events with administrative data. Workshop on Measurement Error in Administrative Data, 2010
- Adjusting imperfect data: Overview and case studies. NBER, Working paper 12977, 2007
- Confidentiality Protection in the Census Bureau’s Quarterly Workforce Indicators. U.S. Census Bureau, LEHD and Cornell University, presented at the Joint Statistical Meetings 2005, Minneapolis, MN. 2006-02, 2005
- Abandoning the sinking ship: The composition of worker flows prior to displacement. LEHD, U.S. Census Bureau, Technical paper TP-2002-11, 2002
- The creation of the employment dynamics estimates. LEHD, U.S. Census Bureau, Technical paper TP-2002-13, 2002
- The sensitivity of economic statistics to coding errors in personal identifiers. LEHD, U.S. Census Bureau, Technical paper TP-2002-17, 2002
- Displaced workers, early leavers, and re-employment wages. LEHD, U.S. Census Bureau, Technical paper TP-2002-18, 2002
- Escaping poverty for low-wage workers: The role of employer characteristics and changes. LEHD, U.S. Census Bureau, Technical paper TP-2001-02, 2001
- Longitudinal analysis of SSN response on SIPP 1990-1993 panels. LEHD, U.S. Census Bureau, Technical paper TP-2000-01, 2000
- Sector-specific on-the-job training: Evidence from U.S. data. CIRANO, Scientific Series 97s-42, 1997
Conference Papers (25)
- Report of the AEA Data Editor. In AEA Papers and Proceedings, May 2025
- Report of the AEA Data Editor. In AEA Papers and Proceedings, May 2024
- Report of the AEA Data Editor. In AEA Papers and Proceedings, May 2023
- Report by the AEA Data Editor. In AEA Papers and Proceedings, May 2022
- Report by the AEA Data Editor. In AEA Papers and Proceedings, May 2021
- Report by the AEA Data Editor. In AEA Papers and Proceedings, May 2020
- Report by the AEA Data Editor. In AEA Papers and Proceedings, May 2019
- Why the Economics Profession Must Actively Participate in the Privacy Protection Debate. In AEA Papers and Proceedings, May 2019
- Making Confidential Data Part of Reproducible Research. In Methods to Foster Transparency and Reproducibility of Federal Statistics: Proceedings of a Workshop, 2019
- Synthetic data via quantile regression for heavy-tailed and heteroskedastic data. In Privacy in statistical databases, 2018. See https://github.com/labordynamicsinstitute/replication_qr_synthetic for replication code.
- Utility Cost of Formal Privacy for Releasing National Employer-Employee Statistics. In Proceedings of the 2017 International Conference on Management of Data, 2017
- Proceedings from the 2016 NSF–Sloan Workshop on Practical Privacy. In 2016 NSF–Sloan Workshop on Practical Privacy, 2017
- CED²AR: The Comprehensive Extensible Data Documentation and Access Repository. In ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014), 2014
- Using partially synthetic data to replace suppression in the business dynamics statistics: Early results. In Privacy in statistical databases, 2014
- Synthetic longitudinal business databases for international comparisons. In Privacy in statistical databases, 2014
- CED²AR: The Comprehensive Extensible Data Documentation and Access Repository. In ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014), Sep 2014
- Replicating the Synthetic LBD with German establishment data. In Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (session STS062), 2013
- Encoding Provenance Metadata for Social Science Datasets. In Metadata and Semantics Research. Communications in Computer and Information Science, 2013
- Encoding Provenance of Social Science Data: Integrating PROV with DDI. In 5th Annual European DDI User Conference, 2013
- A Proposed Solution to the Archiving and Curation of Confidential Scientific Inputs. In Privacy in Statistical Databases, 2012
- Using linked employer-employee data to investigate the speed of adjustment in downsizing firms in Canada and the US. In International census research data center conference, Oct 2009
- How protective are synthetic data? In Privacy in statistical databases, Sep 2008
Book Chapters (12)
- Improving Privacy for Respondents in Randomized Controlled Trials: A Differential Privacy Approach. In Data Privacy Protection and the Conduct of Applied Research: Methods, Approaches and New Findings, 2026. Longer version: https://doi.org/10.48550/arXiv.2309.14581 (v1). Corrected version: https://doi.org/10.48550/ARXIV.2309.14581 (v2).
- Using Containers for Analysis Validation at Scale. In Data Privacy Protection and the Conduct of Applied Research: Methods, Approaches and New Findings, 2026. For expanded version, see https://doi.org/10.1162/99608f92.4d1853ce
- Protecting Confidential Data through Non-Statistical Methods. In Handbook of sharing confidential data: differential privacy, secure multiparty computation, and synthetic data, 2024
- Disclosure Limitation and Confidentiality Protection in Linked Data. In Administrative Records for Survey Methodology, Apr 2021
- Using Administrative Data for Research and Evidence-Based Policy: An Introduction. In Handbook on Using Administrative Data for Research and Evidence-based Policy, Jan 2021
- Balancing Privacy and Data Usability: An Overview of Disclosure Avoidance Methods. In Handbook on Using Administrative Data for Research and Evidence-based Policy, Jan 2021
- Physically Protecting Sensitive Data. In Handbook on Using Administrative Data for Research and Evidence-based Policy, Jan 2021
- The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators. In Producer Dynamics: New Evidence from Micro Data, 2009
- The link between human capital, mass layoffs, and firm deaths. In Producer dynamics: New evidence from micro data, 2009
- Adjusting imperfect data: Overview and case studies. In The structure of wages: An international comparison, Jan 2009
- How did universal primary education affect returns to education and labor market participation in Uganda? In Youth in Africa’s labor market, 2008
- Early career experiences and later career outcomes: An International Comparison. In Human capital over the life cycle - A European perspective, 2004. Section 5
Books (1)
- Handbook on Using Administrative Data for Research and Evidence-based Policy. Jan 2021
Data & Software (27)
- labordynamicsinstitute/readin_qcew_sas: A sequence of programs to read in QCEW data from the Bureau of Labor Statistics. Jun 2020
- Uncertainty in times of COVID-19: Raw survey data. Labor Dynamics Institute, [data] v20200622-clean, Jul 2020
- Replication code and data for: Recalculating ... How Uncertainty in Local Labor Market Definitions Affects Empirical Findings. Oct 2020. See also https://larsvilhuber.github.io/MobZ/README.html.
- Applying Data Synthesis for Longitudinal Business Data across Three Countries [data and code]. May 2020. See also https://labordynamicsinstitute.github.io/SyntheticLEAP/.
- Presentation: metajelo, a metadata package for journals to support external linked objects. Feb 2019
- larsvilhuber/clone-chetty-use-admin-data: Data behind the Chetty (2012) figure on Time Trends in the Use of Administrative Data. Oct 2018
- Synthetic population housing and person records for the United States. 2017
- Replication Materials for Disclosure Limitation and Confidentiality Protection in Linked Data. Dec 2017
- labordynamicsinstitute/rampnoise: Code for Multiplicative Noise Infusion. Dec 2017
- Larsvilhuber/Jobcreationblog: Replication For: How Much Do Startups Impact Employment Growth In The U.S.? 2016
- Replication data for: National estimates of gross employment and job flows from the Quarterly Workforce Indicators with demographic and industry detail. 2014
- CED²AR: Comprehensive Extensible Data Documentation and Access Repository. 2013. Online resource
- VirtualRDC - Synthetic Data Server. 2010. Online resource