How does metadata get attached to a piece of content at the New York Times?

Another entry in my loose series: how do professional content management systems actually work? It is a topic that interests me. I have already written here about Arc, the Washington Post's CMS, which is now also sold as a product. If you know me from Twitter, you may know my remarks on the subject from there as well.

In the DAM (digital asset management) corner of my feed reader I have now fished out an interesting inside view of the process at the New York Times. Here is the process description by the NYT's senior taxonomist:

Metadata and the Tagging Process at The New York Times – IPTC:

Once an asset (an article, slideshow, video or interactive feature) is created in the content management system, the categorization software is called. This software runs the text against the rules for subjects and then through the rules for entities (proper nouns). Once this process is complete, editors are presented with suggestions for each term type within our schema: subjects, organizations, people, locations and titles of creative works. The subject suggestions also contain a relevancy score. The editor can then choose tags from these suggestions to be assigned to an article. If they do not see a tag that they know is in the vocabulary suggested to them, the editors have the option to search for that term within the vocabulary. If there are new entities in the news, the editors can request that they be added as new terms. Once the article is published/republished the tags chosen from the vocabulary are assigned to the article and the requested terms are sent to the Taxonomy Team.
Jennifer Parrucci
Senior Taxonomist at The New York Times

I saw something similar ten years ago or so at an edition of OMD in Düsseldorf, the predecessor event to dmexco (yes, I am that old!). A system of this kind, in use at Tomorrow Focus, was presented there.

So let me paraphrase the process in my own words:

A content object has been created (presumably marked as finished)
Software checks the text for matching words, entities, subjects, organizations, people, locations and titles of creative works (books and music presumably count too)
Software generates suggestions for the tags
The editor accepts or rejects the suggestions
The editor can select further tags from a controlled vocabulary
The article is published
Confirmed tags are attached to the content
Requested tags are sent to the taxonomy team for review
The taxonomy team approves or rejects these requested tags
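
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software the NYT uses: the tiny vocabulary, the keyword rules, the way the relevancy score is computed, and all names (`VOCABULARY`, `suggest_tags`, `publish`, `review`) are made up for this example. As a simplification, every suggestion gets a score here, whereas the quote only mentions scores for subjects.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: tag -> (term type, trigger keywords).
VOCABULARY = {
    "Elections": ("subject", ["election", "ballot", "vote"]),
    "Climate Change": ("subject", ["climate", "emissions"]),
    "United Nations": ("organization", ["united nations"]),
}


@dataclass
class Suggestion:
    tag: str
    term_type: str
    relevancy: float  # assumed score: share of trigger keywords found in the text


@dataclass
class Article:
    text: str
    tags: list = field(default_factory=list)


def suggest_tags(text):
    """Run the text against the keyword rules and return scored suggestions."""
    lowered = text.lower()
    suggestions = []
    for tag, (term_type, keywords) in VOCABULARY.items():
        hits = sum(1 for kw in keywords if kw in lowered)
        if hits:
            suggestions.append(Suggestion(tag, term_type, hits / len(keywords)))
    # Highest relevancy first, so the editor sees the best matches on top.
    return sorted(suggestions, key=lambda s: s.relevancy, reverse=True)


def publish(article, accepted, requested, review_queue):
    """Attach accepted vocabulary tags; queue unknown terms for the taxonomy team."""
    article.tags = [t for t in accepted if t in VOCABULARY]
    review_queue.extend(t for t in requested if t not in VOCABULARY)
    return article


def review(term, approve, review_queue, term_type="subject"):
    """Taxonomy team decision: an approved term joins the vocabulary."""
    review_queue.remove(term)
    if approve:
        VOCABULARY[term] = (term_type, [term.lower()])
```

In use, an editor would look at the output of `suggest_tags(article.text)`, pass the tags they agree with as `accepted`, and pass anything missing from the vocabulary as `requested`. Only the accepted vocabulary tags end up on the article; the requested terms wait in `review_queue` until someone calls `review`.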

Wow. Well thought out, and worth imitating!

Photo by Marek Szturc on Unsplash

Krautsource

Thoughts on digital products
