tech.transparencia.document.mxdof.note

transparencia.pds.transparencia.tech

Documentation

Gazette-specific metadata for a single DOF nota. Strong-references the universal document.item record that carries title, retrieval, publisher, and dates. This sidecar only carries fields that wouldn't make sense outside the DOF context (SIDOF codes, matutina/vespertina edition, dependencia/organismo hierarchy, page/order within the diario PDF).

main record

Record Key: any (any valid record key)

Properties

authorityLevel string Optional

Normalized authority tier derived from dependencia/organismo. Useful for analytics without replacing the raw text fields. Open set; pipelines should add new tiers as edge cases appear.

maxLength: 64 bytes
Known values: federal_executive, federal_judicial, federal_legislative, autonomous_constitutional_body, state_judicial, state_executive, state_legislative, municipal, state_owned_enterprise, decentralized_body, procurement_section, judicial_notices_section, other
codDiario integer Required

SIDOF identifier for the parent diario (gazette edition). Groups all notas published together in the same daily edition (matutina, vespertina, or extraordinaria).

minimum: 1
codNota integer Required

Primary DOF nota identifier from SIDOF. UNIQUE across all DOF notas. This is the key external identifier for deduplication and for constructing public DOF URLs.

minimum: 1
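
Since codNota is described as unique across all DOF notas, a pipeline can key deduplication on it. The following is a minimal sketch, assuming records are plain dicts and that keeping the first-seen record is acceptable; the function name `dedupe_by_cod_nota` is hypothetical and not part of the lexicon.

```python
from typing import Iterable


def dedupe_by_cod_nota(records: Iterable[dict]) -> list[dict]:
    """Keep the first record seen for each codNota, preserving input order.

    Assumption: "first seen wins" is the desired policy; a real pipeline
    might instead prefer the most recently updated record.
    """
    seen: set[int] = set()
    unique: list[dict] = []
    for record in records:
        cod_nota = record["codNota"]  # required field, so a KeyError signals bad input
        if cod_nota in seen:
            continue
        seen.add(cod_nota)
        unique.append(record)
    return unique
```
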
codSeccion string Optional

Section code within the diario as published by SIDOF (e.g., 'PRIMERA', 'SEGUNDA', 'UNICA', 'AVISOS JUDICIALES', 'LICITACIONES'). Free-form; conventions vary by edition.

maxLength: 128 bytes
maxGraphemes: 64 graphemes
contentTextAvailable boolean Optional

Whether normalized plain text was successfully captured by the ingestion pipeline (separate from whether the source published HTML/PDF; reflects what we actually extracted).

createdAt string datetime Required

When this AT Protocol record was created.

dependencia string Optional

Top-level authority label from SIDOF (nombreCodOrgaUno). Examples: 'PODER EJECUTIVO', 'ORGANISMOS AUTONOMOS', 'AVISOS JUDICIALES Y GENERALES', 'PROCURACIONES'. Kept verbatim from source for provenance; the document.item.issuingBodies array carries the structured form.

maxLength: 512 bytes
maxGraphemes: 128 graphemes
documentClass string Optional

Broad publication family. Optional at the raw-record level because for many notas it can only be determined from content (and is then mirrored in tech.transparencia.document.mxdof.note.enrichment.documentClass).

maxLength: 64 bytes
Known values: normative, judicial, procurement, intergovernmental, planning, administrative, informational, other
edition string Required

Edition label for the parent diario. Lowercased token form (the SIDOF API returns capitalized labels; pipelines should normalize on write).

maxLength: 32 bytes
Known values: matutina, vespertina, extraordinaria
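
The normalize-on-write step for edition could look roughly like the sketch below. It assumes the capitalized SIDOF label arrives as a plain string; the helper name `normalize_edition` is hypothetical. Values outside the known set are not rejected here, on the assumption that the known values are advisory rather than a closed enum.

```python
KNOWN_EDITIONS = {"matutina", "vespertina", "extraordinaria"}
MAX_EDITION_BYTES = 32  # maxLength from the schema, measured in UTF-8 bytes


def normalize_edition(raw_label: str) -> str:
    """Lowercase and trim a SIDOF edition label before writing the record."""
    token = raw_label.strip().lower()
    if not token:
        raise ValueError("edition is required and must not be empty")
    if len(token.encode("utf-8")) > MAX_EDITION_BYTES:
        raise ValueError("edition exceeds 32 bytes")
    return token
```

A caller that wants to flag unexpected labels can check `normalize_edition(label) in KNOWN_EDITIONS` separately.
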
hasDoc boolean Optional

Whether SIDOF reports an attached document artifact beyond HTML/PDF (existeDoc = 'S').

hasHtml boolean Optional

Whether SIDOF reports an HTML content page for this nota (existeHtml = 'S').

hasImage boolean Optional

Whether SIDOF reports image attachments (existeImagen = 'S').

hasPdf boolean Optional

Whether SIDOF reports PDF availability for the nota or its enclosing diario (existePdf = 'S').
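
The four availability flags above all follow the same 'S' convention, so they can be mapped in one pass. This is a sketch under assumptions: the raw SIDOF payload is taken to be a dict keyed by the field names quoted in the descriptions (existeDoc, existeHtml, existeImagen, existePdf), and any value other than 'S' is treated as false. The helper name `availability_flags` is hypothetical.

```python
# SIDOF payload key -> record field, as named in the property descriptions.
FLAG_FIELDS = {
    "existeDoc": "hasDoc",
    "existeHtml": "hasHtml",
    "existeImagen": "hasImage",
    "existePdf": "hasPdf",
}


def availability_flags(payload: dict) -> dict[str, bool]:
    """Translate SIDOF 'S' markers into the optional boolean record fields.

    Keys missing from the payload are omitted from the result, since the
    record fields are optional.
    """
    flags: dict[str, bool] = {}
    for sidof_key, record_key in FLAG_FIELDS.items():
        value = payload.get(sidof_key)
        if value is None:
            continue
        flags[record_key] = str(value).strip().upper() == "S"
    return flags
```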

issuingAuthority string Optional

Normalized human-readable issuer name when the pipeline produces one (e.g., 'Secretaría de Salud' instead of raw 'SECRETARIA DE SALUD'). Optional — many notas only expose dependencia-level information.

maxLength: 512 bytes
maxGraphemes: 128 graphemes
item ref com.atproto.repo.strongRef Required

Strong reference to the tech.transparencia.document.item record that holds the universal document metadata (title, retrieval URLs, publisher, dates, jurisdiction).

order string Optional

Ordering value from the SIDOF feed used to preserve sequence inside the diario/section. Encoded as a decimal string because AT Protocol does not support floats.

maxLength: 32 bytes
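
Because order is stored as a decimal string, comparing the raw strings would sort '10' before '2'. One way to recover the numeric sequence is to parse with Python's `decimal` module, as in this sketch. Placing notas with a missing or unparseable order at the end is an assumption, and the helper name `order_key` is hypothetical.

```python
from decimal import Decimal, InvalidOperation

_LAST = Decimal("Infinity")  # sentinel that sorts after every finite value


def order_key(record: dict) -> Decimal:
    """Sort key that reads the decimal-string `order` field numerically."""
    raw = record.get("order")
    if raw is None:
        return _LAST
    try:
        value = Decimal(raw)
    except InvalidOperation:
        return _LAST
    # Guard against 'NaN' or 'Infinity' strings, which parse but do not order sensibly.
    return value if value.is_finite() else _LAST
```

For example, `sorted(notas, key=order_key)` returns the notas in feed sequence with unordered ones last.
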
organismo string Optional

More specific issuing body from SIDOF (codOrgaDos). Examples: 'SECRETARIA DE ENERGIA', 'INSTITUTO NACIONAL ELECTORAL', 'SUPREMA CORTE DE JUSTICIA DE LA NACION'.

maxLength: 512 bytes
maxGraphemes: 128 graphemes
page integer Optional

First printed page number within the parent diario PDF when available.

minimum: 1
pageUntil integer Optional

Last printed page number when the nota spans multiple pages.

minimum: 1
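
The page and pageUntil fields together describe a printed page range. The sketch below computes how many pages a nota covers. It rests on two assumptions the schema does not state: that a missing pageUntil means the nota sits on a single page, and that pageUntil is never smaller than page. The helper name `page_span` is hypothetical.

```python
from typing import Optional


def page_span(page: Optional[int], page_until: Optional[int]) -> Optional[int]:
    """Return the number of printed pages covered, or None if page is unknown."""
    if page is None:
        return None
    if page < 1:
        raise ValueError("page must be >= 1")
    if page_until is None:
        return 1  # assumption: no pageUntil means a single-page nota
    if page_until < page:
        raise ValueError("pageUntil must not precede page")
    return page_until - page + 1
```
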
pdfStoragePath string Optional

Internal storage path for the diario PDF when retained in object storage (e.g., 'YYYY/MM/DD/diario_{codDiario}.pdf'). Not a public URL — for analytics provenance only.

maxLength: 1024 bytes
maxGraphemes: 256 graphemes
rawImportedAt string datetime Optional

When the raw SIDOF payload was first ingested by the pipeline.

tipoNota string Optional

Raw tipo_nota from the SIDOF feed when available. Common examples include 'AVISO', but many high-signal normative records arrive with null tipo_nota (the actual document class is then derived by the enrichment layer's tipoActo field).

maxLength: 128 bytes
maxGraphemes: 64 graphemes
updatedAt string datetime Optional

When this record was last materially updated.

Raw schema
{
  "key": "any",
  "type": "record",
  "record": {
    "type": "object",
    "required": [
      "item",
      "codNota",
      "codDiario",
      "edition",
      "createdAt"
    ],
    "properties": {
      "item": {
        "ref": "com.atproto.repo.strongRef",
        "type": "ref",
        "description": "Strong reference to the tech.transparencia.document.item record that holds the universal document metadata (title, retrieval URLs, publisher, dates, jurisdiction)."
      },
      "page": {
        "type": "integer",
        "minimum": 1,
        "description": "First printed page number within the parent diario PDF when available."
      },
      "order": {
        "type": "string",
        "maxLength": 32,
        "description": "Ordering value from the SIDOF feed used to preserve sequence inside the diario/section. Encoded as a decimal string because AT Protocol does not support floats."
      },
      "hasDoc": {
        "type": "boolean",
        "description": "Whether SIDOF reports an attached document artifact beyond HTML/PDF (existeDoc = 'S')."
      },
      "hasPdf": {
        "type": "boolean",
        "description": "Whether SIDOF reports PDF availability for the nota or its enclosing diario (existePdf = 'S')."
      },
      "codNota": {
        "type": "integer",
        "minimum": 1,
        "description": "Primary DOF nota identifier from SIDOF. UNIQUE across all DOF notas. This is the key external identifier for deduplication and for constructing public DOF URLs."
      },
      "edition": {
        "type": "string",
        "maxLength": 32,
        "description": "Edition label for the parent diario. Lowercased token form (the SIDOF API returns capitalized labels; pipelines should normalize on write).",
        "knownValues": [
          "matutina",
          "vespertina",
          "extraordinaria"
        ]
      },
      "hasHtml": {
        "type": "boolean",
        "description": "Whether SIDOF reports an HTML content page for this nota (existeHtml = 'S')."
      },
      "hasImage": {
        "type": "boolean",
        "description": "Whether SIDOF reports image attachments (existeImagen = 'S')."
      },
      "tipoNota": {
        "type": "string",
        "maxLength": 128,
        "description": "Raw tipo_nota from the SIDOF feed when available. Common examples include 'AVISO', but many high-signal normative records arrive with null tipo_nota (the actual document class is then derived by the enrichment layer's tipoActo field).",
        "maxGraphemes": 64
      },
      "codDiario": {
        "type": "integer",
        "minimum": 1,
        "description": "SIDOF identifier for the parent diario (gazette edition). Groups all notas published together in the same daily edition (matutina, vespertina, or extraordinaria)."
      },
      "createdAt": {
        "type": "string",
        "format": "datetime",
        "description": "When this AT Protocol record was created."
      },
      "organismo": {
        "type": "string",
        "maxLength": 512,
        "description": "More specific issuing body from SIDOF (codOrgaDos). Examples: 'SECRETARIA DE ENERGIA', 'INSTITUTO NACIONAL ELECTORAL', 'SUPREMA CORTE DE JUSTICIA DE LA NACION'.",
        "maxGraphemes": 128
      },
      "pageUntil": {
        "type": "integer",
        "minimum": 1,
        "description": "Last printed page number when the nota spans multiple pages."
      },
      "updatedAt": {
        "type": "string",
        "format": "datetime",
        "description": "When this record was last materially updated."
      },
      "codSeccion": {
        "type": "string",
        "maxLength": 128,
        "description": "Section code within the diario as published by SIDOF (e.g., 'PRIMERA', 'SEGUNDA', 'UNICA', 'AVISOS JUDICIALES', 'LICITACIONES'). Free-form; conventions vary by edition.",
        "maxGraphemes": 64
      },
      "dependencia": {
        "type": "string",
        "maxLength": 512,
        "description": "Top-level authority label from SIDOF (nombreCodOrgaUno). Examples: 'PODER EJECUTIVO', 'ORGANISMOS AUTONOMOS', 'AVISOS JUDICIALES Y GENERALES', 'PROCURACIONES'. Kept verbatim from source for provenance; the document.item.issuingBodies array carries the structured form.",
        "maxGraphemes": 128
      },
      "documentClass": {
        "type": "string",
        "maxLength": 64,
        "description": "Broad publication family. Optional at the raw-record level because for many notas it can only be determined from content (and is then mirrored in tech.transparencia.document.mxdof.note.enrichment.documentClass).",
        "knownValues": [
          "normative",
          "judicial",
          "procurement",
          "intergovernmental",
          "planning",
          "administrative",
          "informational",
          "other"
        ]
      },
      "rawImportedAt": {
        "type": "string",
        "format": "datetime",
        "description": "When the raw SIDOF payload was first ingested by the pipeline."
      },
      "authorityLevel": {
        "type": "string",
        "maxLength": 64,
        "description": "Normalized authority tier derived from dependencia/organismo. Useful for analytics without replacing the raw text fields. Open set; pipelines should add new tiers as edge cases appear.",
        "knownValues": [
          "federal_executive",
          "federal_judicial",
          "federal_legislative",
          "autonomous_constitutional_body",
          "state_judicial",
          "state_executive",
          "state_legislative",
          "municipal",
          "state_owned_enterprise",
          "decentralized_body",
          "procurement_section",
          "judicial_notices_section",
          "other"
        ]
      },
      "pdfStoragePath": {
        "type": "string",
        "maxLength": 1024,
        "description": "Internal storage path for the diario PDF when retained in object storage (e.g., 'YYYY/MM/DD/diario_{codDiario}.pdf'). Not a public URL — for analytics provenance only.",
        "maxGraphemes": 256
      },
      "issuingAuthority": {
        "type": "string",
        "maxLength": 512,
        "description": "Normalized human-readable issuer name when the pipeline produces one (e.g., 'Secretaría de Salud' instead of raw 'SECRETARIA DE SALUD'). Optional — many notas only expose dependencia-level information.",
        "maxGraphemes": 128
      },
      "contentTextAvailable": {
        "type": "boolean",
        "description": "Whether normalized plain text was successfully captured by the ingestion pipeline (separate from whether the source published HTML/PDF; reflects what we actually extracted)."
      }
    }
  },
  "description": "Gazette-specific metadata for a single DOF nota. Strong-references the universal document.item record that carries title, retrieval, publisher, and dates. This sidecar only carries fields that wouldn't make sense outside the DOF context (SIDOF codes, matutina/vespertina edition, dependencia/organismo hierarchy, page/order within the diario PDF)."
}
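
To illustrate the required fields and integer minimums in the schema above, here is a minimal sketch of a record and a lightweight pre-write check. It is not a full Lexicon validator: it covers only required-field presence and the `minimum: 1` constraints. The helper name `validation_errors` is hypothetical, and every value in `example_record` (the AT URI, the CID, and the SIDOF codes) is a made-up placeholder.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = ("item", "codNota", "codDiario", "edition", "createdAt")
MIN_ONE_INTEGER_FIELDS = ("codNota", "codDiario", "page", "pageUntil")


def validation_errors(record: dict) -> list[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    errors = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in record
    ]
    for name in MIN_ONE_INTEGER_FIELDS:
        value = record.get(name)
        if value is None:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            errors.append(f"{name} must be an integer >= 1")
    return errors


example_record = {
    "$type": "tech.transparencia.document.mxdof.note",
    # strongRef carries a uri and a cid; both values here are placeholders.
    "item": {
        "uri": "at://did:example:placeholder/tech.transparencia.document.item/placeholder",
        "cid": "placeholder-cid",
    },
    "codNota": 1,    # placeholder SIDOF identifier
    "codDiario": 1,  # placeholder SIDOF identifier
    "edition": "matutina",
    "createdAt": datetime.now(timezone.utc)
    .isoformat(timespec="milliseconds")
    .replace("+00:00", "Z"),
}
```

Calling `validation_errors(example_record)` returns an empty list, while a record with `codNota` set to 0 yields an error for that field.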
