feat: add support for parsely published date, title, and author #865
Thanks for these, but as you mentioned, I'm a bit torn on whether this makes sense to add. On the one hand, these tags seem widely enough used to include, but this does seem to be repeated info. Is this capturing different metadata we wouldn't get from the JSON-LD already, or is it just in case a site includes these tags but not the JSON? Looking through the JSON-LD description in https://docs.parse.ly/metadata-jsonld/, they have a few
I'd say it's mainly for sites that don't include JSON-LD. I've run into a few others like these too, e.g. I have an open issue right now for
Adding these will be a bit repetitive in the codebase, but it would make metadata capture much more consistent. I'll include the JSON-LD equivalents for anything new as well.
cmkm
left a comment
Took a look at this with Gijs, and while Fred's note about this potentially duplicating data is certainly valid, I think the benefit of capturing more non-JSON-LD metadata outweighs that issue. Thank you for your contribution! :)
Ports "feat: add support for parsely published date, title, and author" (mozilla/readability#865)
Adds Parsely tags as a fallback option for metadata. Parsely is a content analytics service aimed at larger publishers running WordPress, e.g. The Verge.
It's worth noting that Parsely tags are unlikely to exist in isolation and seem to be populated alongside Open Graph tags and JSON-LD data in nearly all cases. I would like to add other tag sets that will be more valuable, though, and this was a nice, simple one to familiarize myself with. I will totally understand if the preference is to keep the regex patterns from growing too large by leaving out less common sources of metadata like this.
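
For illustration only, the fallback behavior described above could be sketched roughly as follows. This is not the code from the PR: `pickMetadata`, its argument shapes, and the exact precedence order are hypothetical, and the `parsely-title`, `parsely-author`, and `parsely-pub-date` names are assumed from Parse.ly's meta tag documentation rather than taken from this thread.

```javascript
// Hypothetical helper, NOT Readability's actual implementation.
// `jsonld` is an object of values already extracted from JSON-LD (may be empty).
// `metaTags` is a plain array of { name, content } pairs standing in for <meta> elements.
function pickMetadata(jsonld, metaTags) {
  // Index meta tags by lowercased name, skipping empty entries.
  const values = {};
  for (const { name, content } of metaTags) {
    if (name && content) {
      values[name.toLowerCase()] = content.trim();
    }
  }

  // The first non-empty candidate wins, so Parsely is only consulted
  // when the earlier sources are missing.
  const firstOf = (...candidates) => candidates.find(v => v) || null;

  return {
    title: firstOf(jsonld.title, values["og:title"], values["parsely-title"]),
    byline: firstOf(jsonld.byline, values["author"], values["parsely-author"]),
    publishedTime: firstOf(
      jsonld.datePublished,
      values["article:published_time"],
      values["parsely-pub-date"]
    ),
  };
}
```

With this ordering, a page that ships JSON-LD or Open Graph tags is unaffected, and the Parsely values only surface on pages that carry nothing else, which is the case raised in the discussion above.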