Article 42

Data access and scrutiny

Providers of very large online platforms or of very large online search engines shall provide the Digital Services Coordinator of establishment and the Commission, upon their reasoned request and within a reasonable period, as specified in the request, access to data that are necessary to monitor and assess compliance with this Regulation, provided that those data are publicly accessible in their online interface or have been provided or obtained on the basis of this Regulation, such as data provided pursuant to Articles 33, 34, 36 and 37.

Providers of very large online platforms or of very large online search engines shall furthermore provide access to data to vetted researchers who meet the requirements laid down in paragraph 4, under the specific conditions set out in Article 40 of Regulation (EU) 2022/2065.

The Commission may, upon a reasoned request of a vetted researcher, require the provider of a very large online platform or of a very large online search engine to provide access to such data.

Understanding This Article

Article 42 addresses an information asymmetry that has hampered digital platform governance since the internet's commercialization. Very Large Online Platforms hold comprehensive data about their operations, user behavior, content ecosystems, algorithmic systems, content moderation decisions, and societal impacts. Regulators, researchers, journalists, and civil society, by contrast, operate largely blind: they depend on voluntary disclosure by platforms, leaked information, or the limited data observable from outside. This asymmetry undermines effective oversight, makes independent verification of platform claims impossible, lets platforms make assertions about their practices that no one can check, and obstructs evidence-based policymaking. Article 42 mandates data access for two categories of recipient with different but complementary needs: (1) regulatory authorities (Digital Services Coordinators and the Commission), for compliance monitoring, enforcement, designation decisions, and regulatory oversight; and (2) vetted academic researchers, for independent scholarly study of platform systems, societal impacts, algorithmic behavior, and systemic risks.

For regulatory access (paragraph 1), VLOPs/VLOSEs must provide data 'upon reasoned request'. Regulators cannot mount fishing expeditions or make exploratory requests without justification; they must articulate specific data needs tied to a legitimate supervisory purpose under the DSA. The request must identify the data sought, explain why it is necessary for compliance monitoring or assessment, and specify a 'reasonable period' for provision. A reasonable period balances regulatory urgency (some investigations require rapid data access) against the work a platform must do to extract, process, anonymize, and prepare data for transfer. Simple requests for already-compiled public data might take days; complex requests for raw algorithmic data that needs processing might take weeks.

The scope of accessible data has two parts:

(a) Data 'publicly accessible in their online interface': content, advertisements, user interfaces, transparency reports, and terms of service that are already public but that platforms could make difficult to access at scale or systematically. Article 42 ensures regulators can systematically access, analyze, and archive public-facing platform information without technical barriers, rate limiting, or access restrictions.

(b) Data 'provided or obtained on the basis of this Regulation': specifically, data submitted under Articles 33 (VLOP designation, including user numbers and market data), 34 (risk assessment reports, including methodologies and findings), 36 (reports on measures taken under the crisis response mechanism), and 37 (independent audit reports, including detailed findings, and the audit implementation reports documenting compliance measures). If a platform has submitted a risk assessment to the DSC, the regulator can request the underlying data supporting the assessment, the methodologies used, and any alternative analyses conducted. This enables regulators to verify platform claims, analyze submitted reports independently, and assess the completeness and accuracy of compliance documentation.

Importantly, the scope is explicitly limited to data 'necessary to monitor and assess compliance'. This is not blanket access to all platform data but targeted access for specific regulatory purposes. The limit protects platform trade secrets, user privacy, and operational confidentiality while still enabling effective oversight.

For researcher access (paragraphs 2-3), Article 42 in conjunction with Article 40 creates unprecedented mandatory academic access to platform data, enabling independent scholarly scrutiny. Before the DSA, researchers depended on platform cooperation that was voluntary, selective, open to unilateral termination (the Social Science One partnership with Meta, for example, ran into disputes over data access), and subject to platform control over research questions and publication. Article 42 creates a legal entitlement to data access for qualified researchers who meet the Article 40 criteria.

Researchers must be 'vetted' through a formal process that examines: affiliation with an academic or research institution; independence from commercial interests and platform influence; qualifications to handle data responsibly, including technical capability and methodological expertise; ability to protect privacy and security through secure data handling; compliance with research ethics standards; and commitment to legitimate research purposes that serve DSA objectives (understanding systemic risks, evaluating platform practices, informing policy). Vetting keeps out actors seeking data under a research pretext (such as competitors or malicious actors) and screens out applicants, such as journalists without research training, who lack the capacity to handle data responsibly, while enabling legitimate academic inquiry.

The access conditions specified in Article 40 and the July 2025 Delegated Act include: purpose limitations (research must serve the DSA's purpose of understanding systemic risks, not general curiosity or commercial ends), confidentiality requirements protecting sensitive data, anonymization where necessary to protect privacy, a prohibition on re-identification attempts, secure data handling procedures, restrictions on sharing data beyond the research team, publication requirements and timelines, and cooperation with the platform's reasonable security measures.

Paragraph 3 gives the Commission the power to compel data access at a vetted researcher's request, which creates a crucial enforcement mechanism against platform obstruction. If a VLOP denies a researcher access on the grounds that the request is unreasonable, too broad, technically infeasible, or a threat to trade secrets, the researcher can appeal to the Commission. The Commission evaluates the request's reasonableness in light of the research purposes, data sensitivity, technical feasibility, privacy implications, and the platform's concerns. If it finds the request reasonable and the denial unjustified, it can order access, specify conditions that protect platform interests, and enforce compliance. This prevents platforms from stonewalling legitimate research, while Commission mediation guards against genuinely unreasonable demands.

Article 42's transformative potential lies in converting platform operations from proprietary black boxes, run on trust and self-regulation, into objects of systematic scholarly and regulatory inquiry subject to verification and evidence. This enables research that was previously impossible:

  • Algorithmic impacts on information diversity, filter bubbles, and polarization
  • Content moderation error rates, biases against particular viewpoints or communities, and inconsistent enforcement
  • Misinformation spread patterns, viral dynamics, and platform amplification of false information
  • Discriminatory ad targeting, especially in housing, employment, and credit
  • Amplification of harmful content through recommender systems
  • Effectiveness of minor protection measures and minors' exposure to harmful content
  • Election interference patterns, foreign influence operations, and coordinated inauthentic behavior
  • How systemic risks manifest and how effective mitigation measures are

These research areas directly serve the DSA's objective of understanding and mitigating the systemic risks that VLOPs pose.

Significant implementation challenges remain, and they will require regulatory guidance, negotiation among platforms, researchers, and regulators, and evolving practice. What constitutes a 'reasonable' data request? How should research value be balanced against platform burden? Can platforms charge cost-recovery fees, or must access be free? How can user privacy be protected under the GDPR while still enabling meaningful research? How can platform trade secrets (proprietary algorithms) be protected while still enabling algorithmic accountability? What technical form should data access take (APIs, data downloads, secure research environments)? These questions are being resolved through Commission implementing acts, DSC coordination through the European Board for Digital Services, the July 2025 Delegated Act establishing procedures and the DSA data access portal, emerging case law, and practical experience as researchers submit requests and platforms respond.

Key Points

  • VLOPs/VLOSEs must provide data access to Digital Services Coordinators and Commission upon reasoned requests for compliance monitoring and enforcement
  • Access limited to data necessary for monitoring compliance and assessing systemic risks, preventing arbitrary data fishing expeditions while enabling targeted oversight
  • Includes publicly accessible data and data provided under DSA obligations (Article 33 designation data, Article 34 risk assessments, Article 36 crisis response reports, Article 37 independent audit reports)
  • Must be provided within reasonable period specified in regulator request, balancing regulatory urgency with technical feasibility of data extraction
  • VLOPs/VLOSEs must provide data access to vetted researchers meeting Article 40 requirements for independent academic scrutiny of platform systems
  • Researcher vetting ensures qualified academics with appropriate safeguards, independence from commercial interests, privacy protection capabilities, and legitimate research purposes
  • Commission can compel data access for vetted researchers upon reasoned request, creating enforcement mechanism when platforms deny legitimate research
  • Delegated Act adopted July 2025 establishes procedures, technical conditions, DSA data access portal, and vetting processes for researcher access
  • Enables independent research on algorithmic impacts, content moderation biases, misinformation spread, targeting discrimination, minor protection effectiveness, election interference
  • Transforms platform oversight from opacity and trust to transparency and verification, addressing fundamental information asymmetry in digital platform governance

Practical Application

For Meta (Establishing Comprehensive Data Access Infrastructure): Meta must implement robust data access infrastructure serving both regulatory and researcher needs for Facebook, Instagram, and WhatsApp, wherever each is designated a VLOP. Implementation components:

(1) Regulatory access portal: A dedicated secure system through which DSCs and the Commission submit data requests, track request status, and receive data deliveries. The system should accept reasoned requests with structured justification fields explaining the compliance monitoring need; acknowledge receipt automatically with a tracking number; assign internal request handlers from legal, policy, and technical teams; provide estimated delivery timelines based on request complexity; deliver data in structured formats (APIs for real-time data, datasets for historical analysis, secure file transfer for large volumes); and document precisely what was provided, creating an audit trail.

(2) Researcher access program: A separate portal for vetted researchers, integrated with the DSA data access portal. It should cover researcher registration and verification of vetting status (coordinated with DSC/Commission vetting); project proposal submission with research questions, methodology, data needs, and privacy protection measures; data access controls providing read-only interfaces, restricted download capabilities with watermarking, and comprehensive access logging; time-limited access aligned with research timelines; privacy protection through anonymization, aggregation, and secure research environments that prevent raw data export; monitoring of researcher compliance with access conditions; and support resources including documentation, technical assistance, and example queries.

(3) Internal workflows when a request is received: Legal review (is the request within Article 42's scope? does it meet the 'necessary for compliance monitoring' standard? are there legal obstacles such as the GDPR, trade secrets, or ongoing litigation?); technical assessment (can the data be extracted? how long will processing take? what format is appropriate? are there technical limitations?); privacy analysis (does the GDPR require anonymization, aggregation, or other protections? can the data be provided while protecting user privacy? is a Data Protection Impact Assessment needed?); trade secret review (does the request expose proprietary algorithms, business intelligence, or competitive information that calls for safeguards such as confidentiality agreements, restricted publication, or redaction?); data preparation, using automated pipelines where possible; quality assurance verifying data accuracy and completeness; and secure delivery with appropriate access controls.

(4) Response timeframes: Meta should establish internal SLAs balancing regulatory needs with operational feasibility: acknowledge requests within 48 hours; provide a feasibility assessment and timeline within 5 business days; deliver simple public data requests within 2 weeks; deliver complex data requiring processing within 4-6 weeks; communicate delays proactively, with explanations; and offer interim data or alternative datasets meeting similar purposes when the full request is delayed.

(5) Dispute resolution process: If Meta believes a request is unreasonable, disproportionate, technically impossible, or a threat to critical interests, it should engage with the requesting regulator or researcher and explain its specific concerns; propose alternative data or a modified scope that meets similar research or oversight goals; document the disagreement with detailed justification; participate in Commission mediation if the matter is escalated; and comply with Commission orders while preserving appeal rights for genuine disputes.

Compliance challenges: The scale of potential requests (dozens of researchers × multiple projects, plus multiple regulators conducting various investigations); resource costs (engineering time, storage infrastructure, security measures, legal review); privacy protection (ensuring GDPR compliance while providing meaningful data); trade secret protection (enabling algorithmic accountability without revealing complete proprietary algorithms); quality assurance (ensuring provided data is accurate, complete, and properly documented); and ongoing access maintenance (researcher projects may span months and require continuous access). Meta must balance transparency obligations against legitimate operational and privacy concerns by investing in automated data access infrastructure that reduces manual effort, establishing clear policies on what can and cannot be shared, and engaging constructively with regulators and researchers to enable oversight while protecting core interests.
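The response timeframes suggested above can be turned into concrete due dates. The following is a minimal sketch, not a prescribed implementation: the function names are hypothetical, business days are assumed to be Monday to Friday with no public holidays, and the upper bound of the 4-6 week range is used for complex requests.

```python
from datetime import date, datetime, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: business days are Monday-Friday; public holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def sla_deadlines(received: datetime, complex_request: bool) -> dict:
    """Compute illustrative internal deadlines for one data access request."""
    # Simple public-data requests: 2 weeks. Complex requests: upper bound
    # of the 4-6 week range (an assumption made for this sketch).
    delivery_weeks = 6 if complex_request else 2
    return {
        "acknowledge_by": received + timedelta(hours=48),
        "feasibility_by": add_business_days(received.date(), 5),
        "deliver_by": received.date() + timedelta(weeks=delivery_weeks),
    }


if __name__ == "__main__":
    # Hypothetical request received on a Monday morning.
    for name, due in sla_deadlines(datetime(2025, 11, 3, 9, 0), False).items():
        print(name, due)
```

Keeping the deadlines in one function makes it easy to log them alongside the tracking number when a request is acknowledged, and to flag delays that need proactive communication.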

For Researchers (Navigating Vetting and Conducting Platform Research): Academic researchers seeking platform data access under Article 42 must navigate the vetting process, design compliant research, and work within the access conditions. Process:

(1) Vetting application: The researcher submits an application to the designated DSC (typically the DSC of the Member State where the researcher's institution is located) through the DSA data access portal (operational since October 2025). The application includes: researcher credentials (academic position, institutional affiliation, relevant expertise); a research proposal (questions, hypotheses, methodology, societal relevance to DSA objectives); a data needs specification (what platform data is necessary, why it is necessary, how it will be analyzed); privacy and security measures (how data will be protected, stored, accessed, analyzed, and destroyed after the research); an independence declaration (no conflicts of interest, commercial interests, or platform funding affecting the research); ethics approval from an institutional review board; and publication plans (commitment to publish findings, timelines, pre-publication notification to the platform). The DSC reviews the application, assessing: researcher qualifications (a legitimate academic researcher with appropriate expertise); research legitimacy (the project serves the DSA's purpose of understanding systemic risks and platform impacts, not general curiosity or commercial ends); data necessity (the requested data is proportionate and necessary for the stated research purposes); privacy and security capacity (the researcher can protect the data appropriately); and independence (no disqualifying conflicts of interest). The DSC may request additional information, propose modifications, or deny applications that do not meet the criteria. Approved researchers receive a vetting certificate enabling data requests to platforms.

(2) Data request submission: The vetted researcher submits a data request to a specific VLOP/VLOSE through the DSA portal or the platform's researcher access portal. The request specifies: vetting certification, research project, specific data requested (content data, user behavior data, algorithmic data, moderation data), time period, proposed analysis methods, privacy protections, and access duration needed. The platform must respond within the reasonable timeframe specified in the Delegated Act.

(3) Negotiation and access: The platform may negotiate the request's scope, propose alternative data that meets the research needs with less privacy or security risk, and impose conditions protecting trade secrets (confidentiality agreements, restricted publication of proprietary details, secure research environments preventing data export). Researcher and platform should engage constructively to find mutually acceptable access that balances research needs and platform interests. If the platform denies access or imposes unreasonable conditions, the researcher can appeal to the Commission, which may order access.

(4) Conducting research: The researcher accesses data under the agreed conditions, typically through secure APIs providing query access, data downloads with use restrictions, secure research environments (platform-provided or third-party) where analysis occurs without raw data export, or aggregated data releases protecting individual privacy. The researcher must comply with the access conditions: use data only for the stated research purposes, implement security measures, not attempt re-identification, not share data beyond the approved team, acknowledge platform cooperation while maintaining research independence, and notify the platform before publication (a reasonable pre-publication review lets the platform identify errors and request redaction of confidential information, without giving it censorship rights).

(5) Publication and impact: The researcher publishes findings in academic journals, policy reports, and public communications. The research contributes to understanding platform impacts, informs regulatory decisions, creates public accountability pressure on platforms, and advances scientific knowledge.

Example research projects: Analyzing recommender algorithm impacts on political polarization using Facebook/YouTube user interaction data, examining content moderation error rates and potential biases using decision data, studying misinformation spread patterns, evaluating the effectiveness of minor protection measures, and documenting discriminatory ad targeting.

Challenges researchers face: Vetting delays (applications may take weeks or months); platform delays or obstruction (platforms may delay responses, impose restrictive conditions, or deny access, requiring Commission intervention); technical complexity (platform data is often huge and complex, requiring specialized analysis skills); privacy protection requirements (anonymization and secure analysis environments limit analytical flexibility); publication restrictions (platform pre-publication review, trade secret redactions); and resource requirements (storage, computing, security infrastructure). Despite these challenges, Article 42 creates unprecedented research access, enabling independent platform accountability research that was previously impossible.
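The elements a data request should specify, as listed in step (2) above, lend themselves to a simple completeness check before submission. The sketch below is illustrative only: the class, field names, and example values are hypothetical and do not reflect the schema of the DSA data access portal or of any platform portal.

```python
from dataclasses import dataclass, fields


@dataclass
class ResearcherDataRequest:
    """Hypothetical record mirroring the request elements listed in step (2)."""
    vetting_certificate_id: str = ""
    research_project: str = ""
    data_requested: str = ""       # e.g. content, behavior, algorithmic, moderation data
    time_period: str = ""
    analysis_methods: str = ""
    privacy_protections: str = ""
    access_duration: str = ""


def missing_fields(request: ResearcherDataRequest) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(request) if not getattr(request, f.name).strip()]


if __name__ == "__main__":
    # Illustrative draft with made-up values; two elements are still missing.
    draft = ResearcherDataRequest(
        vetting_certificate_id="CERT-EXAMPLE-001",
        research_project="Recommender effects on polarization",
        data_requested="moderation decision data",
        time_period="2024-01 to 2024-12",
        analysis_methods="regression on aggregated outcomes",
    )
    print(missing_fields(draft))
```

A check of this kind catches incomplete requests early, which may reduce back-and-forth during the negotiation stage.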

For Regulators (Using Data Access for Supervisory Activities): Digital Services Coordinators and the Commission use Article 42 data access for various supervisory activities:

(1) Designation verification: When assessing whether a platform meets the VLOP threshold (45 million average monthly active users in the EU), the Commission requests user data, engagement metrics, and geographic distribution. The platform must provide data supporting its user counts so that the Commission can verify the designation or dispute the platform's claims.

(2) Risk assessment evaluation: After a platform submits its Article 34 risk assessment, the DSC requests the underlying data. How did the platform identify systemic risks? What data supported its conclusions? What alternative risk assessments were considered? The underlying data lets the regulator evaluate the assessment's quality, comprehensiveness, and accuracy rather than accepting platform claims at face value.

(3) Audit verification: After an Article 37 independent audit, the regulator may request the audit working papers, the platform data the auditor examined, and additional data verifying the audit findings. This lets the regulator assess audit quality, verify auditor independence, and conduct spot checks.

(4) Complaint investigation: When a DSC investigates a potential DSA violation based on complaints, media reports, or proactive monitoring, data access enables evidence gathering. Example: complaints allege that a platform's recommender system disproportionately amplifies misinformation. The DSC requests content circulation data showing which content is amplified, recommender algorithm parameters and training data, A/B testing results for algorithm changes, and internal studies on misinformation amplification. The data supports an evidence-based investigation into whether a systemic risk exists and whether the platform's mitigation measures are adequate.

(5) Enforcement proceedings: When the Commission pursues enforcement action for DSA non-compliance, data access provides the evidence. Rather than relying on platform self-reporting or external observation, the Commission can request comprehensive data documenting the non-compliance and its extent, duration, and impact.

(6) Policy development: The Commission uses aggregated cross-platform data for policy development: evaluating whether DSA provisions work as intended, identifying emerging risks that require a regulatory response, and informing guidance and best practices.

Regulatory best practices: Issue targeted requests (specific data for specific purposes, not exploratory fishing); justify requests clearly (explain the compliance monitoring need); specify reasonable timelines (balance urgency with technical feasibility); protect confidential information (establish secure data handling, limit access to need-to-know personnel); respect privacy (require anonymization where appropriate, comply with the GDPR); engage constructively (work with platforms to refine requests so that they meet oversight needs while respecting legitimate platform interests); document requests and responses (maintain records for accountability, appeals, and future reference); and share insights (coordinate among DSCs through the European Board, sharing lessons learned, common approaches, and data analysis methodologies).

Article 42 shifts the regulatory approach from one that is reactive, complaint-driven, and dependent on voluntary platform disclosure to one that is proactive, evidence-based, and verification-oriented. Regulators can test platform claims empirically, identify problems through data analysis, and base enforcement on comprehensive evidence rather than assumptions.
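The designation check in item (1) above comes down to simple arithmetic on monthly user counts. The following is a minimal sketch under stated assumptions: the averaging window is taken to be six months, the function names are hypothetical, and the monthly figures in the usage example are invented for illustration rather than taken from any platform.

```python
VLOP_THRESHOLD = 45_000_000  # average monthly active users in the EU


def average_monthly_users(monthly_counts: list[int]) -> float:
    """Average the monthly active user counts supplied for the window."""
    if not monthly_counts:
        raise ValueError("at least one monthly count is required")
    return sum(monthly_counts) / len(monthly_counts)


def meets_vlop_threshold(monthly_counts: list[int],
                         threshold: int = VLOP_THRESHOLD) -> bool:
    """True if the average over the window reaches the threshold."""
    return average_monthly_users(monthly_counts) >= threshold


if __name__ == "__main__":
    # Invented six-month series, for illustration only.
    reported = [44_000_000, 45_000_000, 46_000_000,
                47_000_000, 44_000_000, 46_000_000]
    print(average_monthly_users(reported), meets_vlop_threshold(reported))
```

With access to the underlying monthly figures, a regulator can recompute the average independently instead of relying on a single headline number supplied by the platform.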

For Platforms (Compliance Strategy and Infrastructure): VLOPs/VLOSEs should develop comprehensive Article 42 compliance strategies:

(1) Infrastructure investment: Build or procure data access infrastructure that supports regulatory and researcher requests at scale. Options: develop custom portals integrated with internal data systems, implement research API platforms like Meta's Social Science One 2.0, partner with third-party data trust organizations providing secure research environments, or use the Commission's DSA data access portal. The infrastructure should automate data extraction, anonymization, access logging, and delivery to reduce manual effort.

(2) Request management process: Establish a cross-functional team (legal, policy, engineering, privacy, security, research) to manage requests. Create standard procedures for request intake and tracking, feasibility and scope assessment, privacy/security/trade secret review, data preparation and quality assurance, delivery and access provisioning, ongoing monitoring of researcher projects, and documentation and audit trails.

(3) Privacy protection: Implement robust anonymization, aggregation, and secure access methods that protect user privacy under the GDPR while enabling meaningful research. Techniques: k-anonymity, which ensures individuals are not identifiable; differential privacy, which adds statistical noise; aggregation, which provides summary statistics rather than individual data; secure research environments that restrict data export; and access controls limiting who can access what data. Consult Data Protection Authorities to ensure the approaches are GDPR-compliant.

(4) Trade secret protection: Develop policies on what algorithmic and business information can be shared and what requires confidentiality protections. For sensitive algorithms, consider providing algorithm outputs or behaviors rather than complete source code, offering restricted access in secure environments, requiring confidentiality agreements, and allowing publication of findings while restricting proprietary details. Balance algorithmic accountability with legitimate IP protection.

(5) Researcher engagement: Engage constructively with researchers to find mutually acceptable access. Recognize that researcher access serves the public interest and DSA compliance and is not a threat. Provide documentation, technical support, and reasonable access timelines. Build a reputation as research-friendly, encouraging quality research over confrontational relationships.

(6) Regulatory cooperation: Respond promptly and completely to regulator requests. View data access as collaboration that enables evidence-based regulation, benefiting both platforms (clear expectations, predictable enforcement) and the public interest (effective oversight). Proactively offer relevant data, identify potential issues, and propose solutions.

(7) Transparency: Publish transparency reports documenting data access requests received, processing times, data provided, and denials with justifications. Transparency builds trust with regulators, researchers, and the public.

Platforms that invest in Article 42 compliance infrastructure and constructive engagement will navigate data access obligations more smoothly, build better relationships with the oversight community, and contribute to an evidence-based platform governance ecosystem.
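Two of the privacy techniques named in item (3) above, k-anonymity and differential privacy, can be illustrated in a few lines. This is a simplified sketch, not a production privacy mechanism: the function names and sample records are hypothetical, the k-anonymity check only counts group sizes over chosen quasi-identifiers, and the noisy count uses the standard Laplace mechanism for a counting query (sensitivity 1) with a privacy parameter epsilon chosen by the caller.

```python
import random
from collections import Counter


def is_k_anonymous(records: list[dict], quasi_identifiers: list[str], k: int) -> bool:
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Laplace mechanism for a counting query (sensitivity 1, scale 1/epsilon)."""
    return true_count + laplace_noise(1 / epsilon, rng)


if __name__ == "__main__":
    # Invented records; age band and country act as quasi-identifiers.
    sample = [
        {"age_band": "18-24", "country": "DE", "views": 12},
        {"age_band": "18-24", "country": "DE", "views": 7},
        {"age_band": "25-34", "country": "FR", "views": 3},
        {"age_band": "25-34", "country": "FR", "views": 9},
    ]
    print(is_k_anonymous(sample, ["age_band", "country"], k=2))
    print(noisy_count(len(sample), epsilon=1.0, rng=random.Random()))
```

A smaller epsilon adds more noise and gives stronger privacy at the cost of accuracy, so the value would need to be agreed case by case; real deployments would also track the cumulative privacy budget across queries, which this sketch does not attempt.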