FEP-b2b8: Long-form Text
-
aschrijver:
They are only arbitrary when we don’t assign distinctive semantic meaning to them.
But is there any meaningful semantic difference?
- ActivityStreams says Article is just a multi-paragraph written work. Schema.org says it is specifically for news articles, but that's clearly not what this FEP is suggesting (and doesn't make sense with ActivityStreams' definition).
- Document is literally a tautology and is completely meaningless (the definition may as well have been "A document is a document").
- A `Note`'s only distinguishing characteristic seems to be that it is short (schema.org's `Statement` is not at all how `Note`s are currently used on the fediverse and is clearly not what this FEP is suggesting).
- `Page` is currently used by Lemmy for all posts in communities. `Page` also inherits from `Document`, which is sort of confusing (aren't pages usually part of a document, not the other way around?). And what is a web page other than HTML? But all of these things are essentially just HTML.
My point is that these things are so tenuously defined that it becomes vacuous. They all just boil down to HTML, or less technically, what most people associate with any general "post" on social media (at least those that aren't restricted to short-form content).
In addition, these definitions don't fit how these types are used on the fediverse at all. For instance, comments on Lemmy are currently `Note`s but have no length restriction.
EDIT: Even this post itself is posted on ActivityPub as a `Note`, despite having many paragraphs.
The only actual meaningful distinction between these types seems to be their length, with an arbitrary distinction between single-paragraph and multi-paragraph. But we don't need a standard to tell each implementation where to put the border between "short-form" and "long-form." Each implementation, or even each client, can easily choose by itself what it considers to be "short-form" and "long-form" by simply checking the length itself.
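As a rough illustration of "checking the length themselves", a consumer could ignore the declared type and classify by the content alone. This is only a sketch: the function name, the 500-character threshold and the "more than one paragraph" rule are assumptions made for the example, not anything the FEP or ActivityStreams defines.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and counts <p> elements in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.paragraphs = 0

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraphs += 1

    def handle_data(self, data):
        self.chunks.append(data)


def classify_by_length(obj, max_chars=500):
    """Return "short-form" or "long-form" based only on the object's content.

    max_chars is a hypothetical, implementation-chosen threshold.
    """
    parser = _TextExtractor()
    parser.feed(obj.get("content") or "")
    parser.close()
    text = "".join(parser.chunks).strip()
    if len(text) > max_chars or parser.paragraphs > 1:
        return "long-form"
    return "short-form"
```

Note that the declared `type` never enters the decision here, so a multi-paragraph `Note` and a one-line `Article` are each treated according to what they actually contain.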
No, there's not much meaningful semantic difference even in the wild. Granted, use of non-`Note` types is still rather limited currently, but we can draw some expectations (which come with a hefty dose of exceptions):
- A `Note` is shorter than an `Article` (unless it is not), and vice versa (unless it is not)
- An `Article` contains inline images (unless there aren't any)
- `Note`s tend to contain attachments (unless there aren't any)
... I could go on, but everything I'd say would come with "(unless..)" alongside it.
I think what evan@cosocial.ca is attempting to do with the FEP is assign some suggestions as to how to classify content, and suggesting that there could be display differences on the implementor side for each individual type (unless there aren't any differences ha ha ha)
> Personally I find the distinction between the `Note`, `Document`, `Article` and `Page` types in the Activity Vocabulary entirely arbitrary and they ought to all just be the same type.
The problem here is that `Note` is now loaded with expectations so as to become highly-specific. You can't use inline images, you must cap attachments at 4, you may have to re-order attachments, etc.
-
@julian those four types are very different.
-
julian:
> The problem here is that `Note` is now loaded with expectations so as to become highly-specific. You can't use inline images, you must cap attachments at 4, you may have to re-order attachments, etc.
Says who? I don't see any such requirements in the spec. In Lemmy I can put as many images in a comment as I want. Here on Discourse I don't think there is a limit on any of these things either?
But again, if any implementation wants to handle content differently (like short or long form content, or content with lots of images, or whatever), then that's that implementation's prerogative and you can't use these types to enforce anything anyway.
Handling all manner of arbitrary requirements from different implementations would also be way too complicated. Implementations should rather try to handle as broad a set of content as possible and display it in an appropriate way.
-
Says Mastodon, implicitly, because those are the restrictions you have to follow if you want your content adequately represented on there.
You can say it doesn't matter what Mastodon says, and you're right, but my users don't care about that, they just want their content displayed on Mastodon properly.
-
I wasn't aware Mastodon had such arbitrary requirements. I would say you should not handle this by restricting your users or your display of content to fit the whims of other implementations. That would only spread these arbitrary requirements.
One way to handle this is to display a warning to your users when they are writing a post. If they create a post that is incompatible with Mastodon or another known incompatible implementation, display a (small, nonintrusive) warning that it may not display correctly, or at all, in those implementations.
Hopefully this would put pressure on Mastodon to remove these limitations or encourage people to use other implementations with fewer limitations.
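A minimal sketch of such a composer-side check, assuming the two restrictions mentioned earlier in this thread (no inline images, at most 4 attachments). The function name, the constant and the warning strings are hypothetical; a real implementation would have to verify the actual limits of each target software.

```python
import re

# Assumed limit, taken from the restrictions described in this thread.
MASTODON_MAX_ATTACHMENTS = 4


def compatibility_warnings(obj):
    """Return human-readable warnings for a draft post; never blocks posting."""
    warnings = []

    attachments = obj.get("attachment") or []
    if isinstance(attachments, dict):
        # AS2 allows a single object where a list is expected.
        attachments = [attachments]
    if len(attachments) > MASTODON_MAX_ATTACHMENTS:
        warnings.append(
            f"More than {MASTODON_MAX_ATTACHMENTS} attachments: "
            "some may not be shown on Mastodon."
        )

    if re.search(r"<img\b", obj.get("content") or "", flags=re.IGNORECASE):
        warnings.append("Inline images may not be displayed on Mastodon.")

    return warnings
```

The draft is never modified or rejected; the warnings are only shown to the author, which keeps the restriction from spreading into the publishing software itself.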
-
SorteKanin:
> That would only spread these arbitrary requirements.
This. I have advocated a lot for a standards movement that dares to set its own course depending on what's best for the ecosystem as a whole. But it is not what developers were interested in or able to put their weight behind, and Mastodon stayed the post-facto interoperability leader plotting the direction. The activities around FEPs and the Forum task force are a worthy effort to bring change to that reality.
SorteKanin:
> My point is that these things are so tenuously defined that it becomes vacuous.
I agree. Yet then I'd focus on defining the types better, rather than having a single type that only takes concrete shape in the eye of the beholder.
-
aschrijver:
> Yet then I’d focus on defining the types better
I'm all for well-defined types and I think they ought to be used in more places (for instance FEP-1b12 groups), but let's be clear about what a type should provide: data about how to process the given content or object, in ways that can't be inferred from the rest of the data itself. That is, the type is metadata, not data. But when trying to answer a question about the data, such as "how long is this content", you should look at the data, not the metadata. The data is the source of truth.
I think this problem really stems from the use of ActivityStreams and its unfortunate formulation of these types, which has multiple issues: the redundancy of certain types, their ambiguity, and the way they seem forced into an object-oriented inheritance hierarchy, just to name the ones I can think of off the top of my head. Unfortunately we can't easily change which vocabulary we use.
-
I think the semantics of fedi tend to be less about Note vs Article and more in the bucket of what SIOC calls a Post: http://rdfs.org/sioc/spec/#term_Post
They allow for subtypes like BlogPost, BoardPost, Comment, InstantMessage, MailMessage, WikiArticle. But those are less important than the fact that the Thing is a Post.
wrt the intention of AS2 Vocab:
- Note and Article are intended to be contentful types, i.e. they typically have `content`
- Document is intended to be an artifact or record of information, with subclasses of Image/Video/Audio/Page
- Page is not the same as an Article. There might be an Article on the Page, and the Article might be the main subject of the Page, but they are not the same.
Consider the following:
Some Page
The equivalent would be:
```json
{
  "@context": {"as": "https://www.w3.org/ns/activitystreams#"},
  "@graph": [
    {
      "@id": "#page",
      "@type": "as:Page",
      "as:name": "Some Page"
    },
    {
      "@id": "#article",
      "@type": "as:Article",
      "as:name": "Some Article"
    }
  ]
}
```
In general what `Article` boils down to is a sort of formality: an intent to publish something as an Article is interpreted as an intent to render it more like a blog than a social media post. Note that this is still a pretty contrived distinction, but it's the best we have.
Beyond that, I continue to have some issues with the FEP itself:
-
Hello!
This is a discussion thread for the proposed FEP-b2b8: Long-form Text. Please use this thread to discuss the proposed FEP and any potential problems or improvements that can be addressed.
Summary
Multi-paragraph text is an important content type on the Social Web. This FEP defines best practices for representing and using properties of a long-form text object in Activity Streams 2.0.
cc @eprodrom
Overall I'm fine with this FEP.
Mastodon uses 'summary' for content warnings so when Mastodon ignores this FEP and downgrades an article to a note they'll probably turn the summary into a content warning?
We could plan for Mastodon to drag the chain and just sidestep the issue by calling it something else?
-
Hi rimu@mastodon.nzoss.nz ! Actually two representatives from Mastodon were at the pre-FOSDEM meet to provide their input. Despite the illusion that they throw their weight around on purpose, it definitely is not the case.
In this case Mastodon actually already treats Article and Note differently! They are both converted into a "status", but content warnings only apply for Notes.
NodeBB already made the switch to publishing Articles with a `summary` property, and it is working quite well. It is not treated as a content warning (GtS on the other hand does! Funny how that works.)
Future iterations call for using `preview` to represent the Note (for Mastodon and other microblogs) while the threadiverse and long-formers can ingest the article directly!
-
julian:
> Future iterations call for using `preview` to represent the Note (for Mastodon and other microblogs) while the threadiverse and long-formers can ingest the article directly!
This is where I would point to my comment in https://codeberg.org/evanp/fep/issues/21#issuecomment-3765626
this use of `preview` muddies the waters because you can't necessarily assume that any property of the Article also applies to the preview Note.
taking the FEP at its word:
> Especially for microblogging applications, the `preview` property is a useful fallback for supporting unrecognized object types like Article [...] For an article, the preview can be a Note that gives a well-formatted preview of the article content in its content property. For example, the `name`, `summary`, and a link to the `url` would be an appropriate representation.
if this is an appropriate representation to just take `name`, `summary`, and `url`, then why not just do that directly? i mean, we already have a "useful fallback" described in AS2-Core: using the `name` and `summary`! https://www.w3.org/TR/activitystreams-core/#text-representations
i understand that there is a tension between publishers wanting to be accurate and publish `Article` objects vs. consumers special-casing `Article` objects and using fallback representations for them, but i don't see how `preview` helps here -- you'd still be special-casing.
what is really going on here is that you are dealing with a desire for multiple representations of the same resource, but you are forced to stuff it all into a single HTTP resource. so you are not really asking for a "preview", you are asking for an "alternate representation". `rel=alternate`, not `rel=preview`.
Matthias Pfefferle also raised this concern:
> If we treat the preview Note as a standalone object that can be used as-is, we’d also need to include the context, replies, reposts, likes collections, audiences, and more. From my perspective, it’s impossible to ignore the surrounding object.
You can't just treat the `preview` as "the same object" and assume that any properties of the Article will also apply to the Note. They are separate objects, always.
The more idiomatic way to have an object be represented or interpreted in multiple ways is to use specific vocabularies and classes as appropriate. For example, instead of quibbling about the difference between `Note` and `Article`, you could capture the semantics of a "post" which is what you really care about:
```json
{
  "@id": "#something",
  "@type": [
    "https://www.w3.org/ns/activitystreams#Note",
    "http://rdfs.org/sioc/ns#Post",
    "http://joinmastodon.org/ns#Status"
  ]
}
```
Here, we are making the claim that `#something` is simultaneously a "Note", "Post", and "Status".
The core of the UX issue is that Note is overloaded to mean "Mastodon-style status" instead of actually defining an unambiguous class to represent the set of all things that can be considered Mastodon-style statuses.
You might think that it's bad for interop if everyone defines their own classes for everything, but that's not what I'm saying. What I'm saying is that reusing or overloading a class is also bad for interop, probably even worse. Think of types and classes as a sort of interface that can be fulfilled -- if the interface is fulfilled, the current object/thing belongs to the class or set representing all things that fulfill the interface. You should reuse classes that already exist, but only if the semantics match EXACTLY. If the semantics aren't an exact match, then you have introduced ambiguity.
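To make the "class as a set that an object belongs to" idea concrete, here is a small sketch of how a consumer might test membership when an object declares several types. The helper names are made up for illustration, and the sketch assumes the types are already expanded to full IRIs (no JSON-LD processing is done).

```python
AS = "https://www.w3.org/ns/activitystreams#"
SIOC = "http://rdfs.org/sioc/ns#"


def type_set(obj):
    """Normalize "@type"/"type" (a string or a list) into a set of IRIs."""
    raw = obj.get("@type", obj.get("type", []))
    if isinstance(raw, str):
        raw = [raw]
    return set(raw)


def is_a(obj, class_iri):
    """True if the object explicitly claims membership in the class."""
    return class_iri in type_set(obj)


something = {
    "@id": "#something",
    "@type": [
        AS + "Note",
        SIOC + "Post",
        "http://joinmastodon.org/ns#Status",
    ],
}
```

A forum that only cares about "posts" could then check `is_a(something, SIOC + "Post")` and never need to decide whether the thing is a Note or an Article.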
We might say that for a "Post", what we really care about is the "content", plus some other supporting properties. This is the level at which "social media post" use cases should operate. In some ways, the AS2 content model is a bad fit for those use cases, because the AS2 content model is very heavily oriented toward Activities rather than Posts. The problems of "longform text" are largely artificial.
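The `name` / `summary` / `url` fallback discussed earlier in this post can be built directly from the Article, without a separate `preview` object. The following is only a sketch: the function name and the exact HTML layout are assumptions, and it handles only the simple case where `url` is a plain string.

```python
from html import escape


def fallback_html(obj):
    """Build a short HTML fallback from name, summary and url, if present."""
    parts = []

    name = obj.get("name")
    if name:
        # name is plain text in AS2, so it is escaped before embedding.
        parts.append(f"<p><strong>{escape(name)}</strong></p>")

    summary = obj.get("summary")
    if summary:
        # summary is HTML by default in AS2; it is assumed to be sanitized already.
        parts.append(f"<p>{summary}</p>")

    url = obj.get("url") or obj.get("id")
    if isinstance(url, str):
        parts.append(f'<p><a href="{escape(url, quote=True)}">{escape(url)}</a></p>')

    return "".join(parts)
```

Because the fallback is derived from the Article itself, its replies, likes and audience stay attached to one object rather than being split between an Article and a preview Note.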
-
@julian What is the purpose of `preview` if Mastodon can already render `summary`?
-
To me, the difference between an article and a note is not just its length, but also what formatting is allowed.
A Note typically is shorter but doesn't have to be, and has minimal formatting (typically limited to what Markdown or BBCode would allow).
Whereas an Article is more like a blog post or journalistic article, which typically contains formatting and layout beyond what BBCode or Markdown would allow.
This would be a much better distinction rather than using some arbitrary length as a factor. It would also solve the problem of blog posts looking weird when shown in a client.
-
Yes, that might work. Note having an agreed upon and well-defined constraint to its formatting, and a more general-purpose Article lacking that constraint.
For both types we should beware how base type definitions may or may not affect extended types. I think it is not well defined what happens in terms of behavior for extended and/or multi-types (think e.g. `"type": ["as:Note", "my:StickyNote"]`).
Golden Hammer: Article versus Note?
This section is a bit of an aside, not directly related to the FEP. Yet it relates to the overall design direction of AS/AP and how that may cause new trouble in the future as things are adopted by the installed base.
Wikipedia quote: "[Golden hammer is] the comfort zone state where you don't change anything to avoid risk. The problem with using the same tools every time you can is that you don't have enough arguments to make a choice because you have nothing to compare to and is limiting your knowledge."
For `Article` schema.org suggests the following meaningful sub-types.
- Article
And I copied that whole list because in AS/AP land by default the urge is to try to hammer every information model into the limited set of objects and activities that ActivityStreams provides, and not touch the whole extensibility mechanism of the protocol specification.
As I see it AS only provides the toolbox of basic primitives to build from, but it is the extension mechanism where real domains are modeled for rich social networking use cases. We should go past the `everything-is-a-note-or-it-doesnt-work-with-mastodon` rule, but also not end up with `everything-is-either-note-or-article-or-you-are-on-your-own`.
Had schema.org been incorporated somehow in AS/AP we might have avoided the "if we only have a hammer, everything looks like a nail". #software:nodebb and #software:discourse would be building interoperable `DiscussionForumPosting` and #software:wordpress would have support for `BlogPosting`. Maybe #software:lemmy being a link aggregator may add support for `Review` and its subtypes.
The true power of the social web does not come from cramming society into a one-size-fits-all model, but instead from facilitating an ever growing support for interoperable (app-independent) domain designs. To move towards a heterogeneous social web. So imho..
- Microblogging
- Forums crammed into microblogging
- Media publishing crammed into microblogging
- Forge federation means it microblogs
The direction to follow is for AS/AP to become more meaningful standards again wrt their promise of universal social networking. It means that we should step off of the Mastodon-first approach, and stop turning any social app into a microblogging app to then extend. The standards movement must find its own healthy path again.
AS/AP has huge flexibility. Its extension mechanism should be its strength, not its weakness. We must focus more on it, or AS/AP is doomed to protocol decay hell. It is meaningless if a project says "We are now fediverse-enabled". It is like saying "we have an internet connection".
Instead a project should say "We added Fediverse Microblogging support" or "Task management" support, or "Code reviewing" (better than the not-so-clear "forge federation" umbrella term).
Update copied from this toot:
The versatility of the Linked Data standards that AS/AP is based on is such that specific data models for any social networking use case can be defined.
While LD is suited as storage format for the social / knowledge graph, it is not out-of-the-box a good fit for the AS/AP extension mechanism. Using closed-world models based on strategic design (part of domain-driven design) would be best to define the interoperable msg exchange patterns that occur between actors.
-
aschrijver:
> in AS/AP land by default the urge is to try to hammer every information model into the limited set of objects and activities that ActivityStreams provides, and not touch the whole extensibility mechanism
This is part of the problem but the other problem is trying to hammer it at all. It should fit perfectly without any hammering.
Set theory and logical inferencing
One of the core things about information modeling is that when trying to logically reason about things, you have to realize that those things can belong to multiple sets/classes. Some of those sets may be subsets or supersets. For example, you can use Schema Dot Org to say that all BlogPostings are Articles, but not all Articles are BlogPostings. Using logic, we can infer statements that were never explicitly made, but are still accepted to be true.
For the application developer who wants to handle all manner of Articles, they might encounter something which is an Article but doesn't say that it is an Article. If there is a statement like `@type: http://schema.org/BlogPosting`, you can use an RDF Schema or OWL ontology (not to be confused with Schema Dot Org) to fully infer the following knowledge:
```turtle
PREFIX schema: <http://schema.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Given the following statement...
<#something> a schema:BlogPosting.

# And combining it with the following class relations...
schema:BlogPosting rdfs:subClassOf schema:SocialMediaPosting.
schema:SocialMediaPosting rdfs:subClassOf schema:Article.

# We can infer the following without being explicitly told it:
<#something> a schema:SocialMediaPosting.
<#something> a schema:Article.
schema:BlogPosting rdfs:subClassOf schema:Article.
```
If you're not evaluating this kind of inferencing as runtime code, you're evaluating it in your head and pre-baking it into your application based on your foreknowledge.
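For what it's worth, evaluating this at runtime does not require a full reasoner. The sketch below hard-codes the two class relations from the example into a dictionary (a real application would load them from an ontology) and computes the transitive closure of `rdfs:subClassOf`. The function names are made up for illustration.

```python
# Direct rdfs:subClassOf relations, as stated in the example above.
SUBCLASS_OF = {
    "schema:BlogPosting": {"schema:SocialMediaPosting"},
    "schema:SocialMediaPosting": {"schema:Article"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """All classes reachable from cls by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(declared, subclass_of=SUBCLASS_OF):
    """Declared types plus every superclass they imply."""
    result = set(declared)
    for cls in declared:
        result |= superclasses(cls, subclass_of)
    return result
```

With this, an object that only declares `schema:BlogPosting` is still recognized by code that asks "is this an Article?".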
We can also infer classes/types based on the domains and ranges of certain properties. Instead of claiming "All Articles have an articleBody", we can instead claim "If it has articleBody, it's an Article". Here, we are saying that the domain of "articleBody" is "Article":
```turtle
schema:articleBody rdfs:domain schema:Article.
```
Now, we can similarly infer that something is an Article simply by virtue of it having an articleBody, like duck typing:
```turtle
# Given the following statement:
<#something> schema:articleBody "test post please ignore".

# Combined with the following RDF Schema information:
schema:articleBody rdfs:domain schema:Article.

# We can infer this without being explicitly told it:
<#something> a schema:Article.
```
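The same duck-typing rule can be sketched in a few lines of code. The `DOMAIN` table below contains only the single `rdfs:domain` statement from the example, and the function name is hypothetical.

```python
# rdfs:domain statements: property -> class implied for the subject.
DOMAIN = {
    "schema:articleBody": "schema:Article",
}


def types_from_properties(obj, domain=DOMAIN):
    """Infer classes from the properties an object uses (rdfs:domain)."""
    return {domain[prop] for prop in obj if prop in domain}


something = {"schema:articleBody": "test post please ignore"}
```

Here `types_from_properties(something)` yields `schema:Article` even though the object never declares a type.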
However, we lack enough information to say that it is specifically a BlogPosting. There are no properties that have a domain of BlogPosting. If we want people to know that it's specifically a BlogPosting, then we need to explicitly declare this. But we shouldn't need to explicitly declare every single superclass for compatibility reasons! It would be incorrect to assume that all Articles must be explicitly declared as Articles. In much the same way, we should be able to know that every as:Object is an as:Object without explicitly stating that it has `@type: https://www.w3.org/ns/activitystreams#Object`. Imagine if you came across an Image that looked like this:
```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": ["Object", "Document", "Image"]
}
```
This is not false, but it is unnecessarily verbose if you already know that all Images are Documents and that all Documents are Objects:
```turtle
PREFIX as: <https://www.w3.org/ns/activitystreams#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

as:Image rdfs:subClassOf as:Document.
as:Document rdfs:subClassOf as:Object.
```
Usually, this knowledge is pre-baked into applications because people who read the spec are told that these class relations exist, so if they write good code, then that code should be able to recognize that an Image is also a Document. (If only it were that easy to write good code!)
Information modeling and applying multiple models
So when it comes to modeling what exactly something is, I think that we should find commonalities between different classes of things and distill those into specific properties. ActivityStreams has a decent model for modeling Activity as "something that happened", but it's all the other parts that aren't as coherently modeled -- the Note/Article/Document/Link/Collection stuff could use more thought.
If the aim is to model communities of people online who have discussions, then something like SIOC provides a better content model. Imagine if we didn't have to worry about what was a Note and what was an Article, and we just called it a Post. Does the distinction between a Note and an Article matter? It's highly debatable. Even staying within ActivityStreams, you could consider Note and Article to both be subclasses of some kind of ContentfulObject class, where the domain of as:content is ContentfulObject. (Currently, the domain of as:content is as:Object, which is very broad.)
Put another way, the information model should match the application domain. Mastodon has a mostly microblogging-based approach, but its "statuses" actually fit multiple information models. We don't have to limit ourselves to fitting into one box!
Imagine a Mastodon API response for a Status. Why not expose this in the exact same way that the AS2 response is exposed? We can apply multiple profiles to the same resource. Here, we can use properties from AS2, Mastodon, and SIOC in equal measure. Let's take this abbreviated API response from Mastodon:
```json
{
  "uri": "https://mastodon.social/users/trwnh/statuses/114618667090037785",
  "url": "https://mastodon.social/@trwnh/114618667090037785",
  "id": "114618667090037785",
  "created_at": "2025-06-03T09:14:23.752Z",
  "sensitive": false,
  "content": "<p>test post please ignore</p>",
  "account": {
    "uri": "https://mastodon.social/users/trwnh",
    "url": "https://mastodon.social/@trwnh",
    "id": "14715"
  }
}
```
Now, let's give it a basic JSON-LD context to turn it into Linked Data. Either we can use `@base` with the `id` property...
```json
{
  "@context": {
    "@vocab": "http://joinmastodon.org/ns/api#",
    "@base": "https://mastodon.social/api/v1/statuses/",
    "id": "@id"
  },
  "uri": "https://mastodon.social/users/trwnh/statuses/114618667090037785",
  "url": "https://mastodon.social/@trwnh/114618667090037785",
  "id": "114618667090037785", // expands with @base
  "created_at": "2025-06-03T09:14:23.752Z",
  "sensitive": false,
  "content": "<p>test post please ignore</p>",
  "account": {
    "@context": {
      "@base": "https://mastodon.social/api/v1/accounts/" // override the @base
    },
    "uri": "https://mastodon.social/users/trwnh",
    "url": "https://mastodon.social/@trwnh",
    "id": "14715" // expands against more recently defined @base
  }
}
```
Or we can use `uri` directly instead...
```json
{
  "@context": {
    "@vocab": "http://joinmastodon.org/ns/api#",
    "uri": "@id"
  },
  "uri": "https://mastodon.social/users/trwnh/statuses/114618667090037785", // our @id
  "url": "https://mastodon.social/@trwnh/114618667090037785",
  "id": "114618667090037785",
  "created_at": "2025-06-03T09:14:23.752Z",
  "sensitive": false,
  "content": "<p>test post please ignore</p>",
  "account": {
    "uri": "https://mastodon.social/users/trwnh", // our @id
    "url": "https://mastodon.social/@trwnh",
    "id": "14715"
  }
}
```
We can also make statements to let us infer things:
    PREFIX as: <https://www.w3.org/ns/activitystreams#>
    PREFIX sioc: <http://rdfs.org/sioc/ns#>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX : <http://joinmastodon.org/ns/api#>

    # Equivalences between Mastodon API and ActivityStreams
    :url owl:equivalentProperty as:url.
    :created_at owl:equivalentProperty as:published.
    :sensitive owl:equivalentProperty as:sensitive.
    :content owl:equivalentProperty as:content.
    :account rdfs:subPropertyOf as:attributedTo.

    # Equivalences between Mastodon API and SIOC
    :created_at rdfs:subPropertyOf dcterms:created.
    :content rdfs:subPropertyOf sioc:content.
    :account rdfs:subPropertyOf sioc:has_creator.

Note that the difference between "subproperty" and "equivalent property" is that:
- if A is a "subproperty of" B, then values of A are also values of B, but values of B are not necessarily values of A. Seeing a Mastodon API url lets you infer an as:url, but seeing an as:url doesn't let you infer a Mastodon API url.
- if A is an "equivalent property" to B, then values of A are also values of B, and values of B are also values of A. Seeing a Mastodon API url lets you infer an as:url, and likewise seeing an as:url lets you infer a Mastodon API url.
(Basically, I am saying above that seeing as:attributedTo or sioc:has_creator does not immediately imply that it is a Mastodon account.)
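To make the one-way versus two-way distinction concrete, here is a minimal sketch in Python. It is not how any real reasoner or fediverse implementation works; the `masto:` prefix, the two lookup tables, and the `implied_properties` helper are all made up for illustration, and only a few of the statements above are included.

```python
from typing import Dict, Set

# Hypothetical, hand-written subset of the statements above.
# "masto:" stands in for the Mastodon API vocabulary.
SUB_PROPERTY_OF = {
    ("masto:account", "as:attributedTo"),
    ("masto:account", "sioc:has_creator"),
}
EQUIVALENT_PROPERTY = {
    ("masto:url", "as:url"),
    ("masto:content", "as:content"),
}


def implied_properties(prop: str) -> Set[str]:
    """Return every other property that can be inferred from seeing `prop`."""
    edges: Dict[str, Set[str]] = {}
    # A subproperty implies its superproperty, but not the other way around.
    for sub, sup in SUB_PROPERTY_OF:
        edges.setdefault(sub, set()).add(sup)
    # An equivalent property implies its counterpart in both directions.
    for a, b in EQUIVALENT_PROPERTY:
        edges.setdefault(a, set()).add(b)
        edges.setdefault(b, set()).add(a)

    # Follow the edges transitively, never reporting the starting property.
    found: Set[str] = set()
    stack = [prop]
    while stack:
        current = stack.pop()
        for nxt in edges.get(current, ()):
            if nxt != prop and nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found
```

With these tables, `implied_properties("masto:account")` yields both `as:attributedTo` and `sioc:has_creator`, while `implied_properties("as:attributedTo")` yields nothing, which is the asymmetry described above.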
Using our inferencing abilities, we can present this same Status in three different ways:
    {
      "@context": {
        "@vocab": "http://joinmastodon.org/ns/api#"
      },
      "content": "<p>test post please ignore</p>",
      // ...
    }

    {
      "@context": {
        "@vocab": "https://www.w3.org/ns/activitystreams#"
      },
      "content": "<p>test post please ignore</p>",
      // ...
    }

    {
      "@context": {
        "@vocab": "http://rdfs.org/sioc/ns#"
      },
      "content": "<p>test post please ignore</p>",
      // ...
    }

Or we can present this Status in a single combined resource. The challenge is in negotiating with the requester which specific representation or profile they wish to consume. But in a generic JSON-LD sense, this would work:
    {
      "https://www.w3.org/ns/activitystreams#content": "<p>test post please ignore</p>",
      "http://rdfs.org/sioc/ns#content": "<p>test post please ignore</p>",
      "http://joinmastodon.org/ns/api#content": "<p>test post please ignore</p>"
    }

Yes, this is duplicating the information, but we have tools to prevent that, like inferencing and content negotiation. The duplication arises because most/all consumers do not do any inferencing at all. This means that the publisher needs to do the consumer's inferencing for them, ahead-of-time... or otherwise define something like https://w3c.github.io/dx-connegp/connegp/
Tying it back to the FEP and "longform text" as an application domain
Ultimately, I don't think the divide between Article and Note should be given as much prominence as it currently is. If the difference actually matters, then they should have different information models. For example, if "longform text" meant something more like "the content is split into one or more sections", then this should be reflected in the information model, not stuffed into a single `content` property that equally serves both "shortform" and "longform" applications.
In other words, it's a bad idea to interpret a property differently depending on which type is declared. This whole mess started because Mastodon discriminated against Article resources by not rendering the content directly, which some people don't like. I would argue that the root of this decision is that there is some indelible difference between "Note content" and "Article content" that is not being captured, insofar as you accept "Note" to mean "shortform" and "Article" to mean "longform".
The other thing at play here is that HTML is mostly unstructured data, or rather, the structure of an "article" is arbitrary. The `<article>` tag can contain basically anything. If you want something more structured, then you should actually implement that structure instead of just dumping HTML into `content`.
Take a look at something like https://csarven.ca/linked-data-notifications when parsed as RDF sometime, and you'll see some really interesting things:
- The resource is declared to be a `bibo:Document, sioc:Post, schema:ScholarlyArticle, prov:Entity, foaf:Document, as:Article` in equal measure. How many fediverse applications do you think would be able to recognize that this is an Article and properly handle it as such?
- The resource declares a `schema:hasPart` whose value is an RDF List `(:introduction, :related-work, :requirements-and-design-considerations, :protocol, :implementations, :analysis-and-evaluation, :conclusions, :acknowledgements)`. Each of these sections is independently addressable and described.
  - The HTML content of each section is available as the `schema:description`, and subsections are likewise made available via `schema:hasPart` again.
- Comments are also included and similarly extensively described in multiple vocabularies -- AS2, SIOC, Schema Dot Org, Web Annotations, as appropriate.
So how much of this can we say is necessary for "longform text"? Granted, the level of detail in this scholarly article is probably far beyond what most people care to describe for a personal blog or social media account. But it's worth considering which parts make up which application domains, and therefore which information models should include which parts.
Maybe the baseline needs to change such that an "Article" is no longer just a blob of HTML content, indistinguishable from a "Note" except by what the publisher chooses to declare. If that structure is necessary, then it should be accounted for.
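To sketch what "accounting for that structure" might mean, here is a hypothetical information model in Python, loosely modeled on the `schema:hasPart` / `schema:description` layout described above. The `Section` and `StructuredArticle` classes and the `flatten` method are illustrative assumptions, not anything defined by AS2, the FEP, or any existing implementation.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Section:
    id: str    # each section is independently addressable
    name: str
    html: str  # the section's own HTML, in the spirit of schema:description
    parts: List["Section"] = field(default_factory=list)  # subsections


@dataclass
class StructuredArticle:
    id: str
    name: str
    parts: List[Section] = field(default_factory=list)

    def flatten(self) -> str:
        """Collapse the sections, in document order, into one HTML blob."""

        def walk(sections: List[Section]) -> Iterator[str]:
            for section in sections:
                yield section.html
                yield from walk(section.parts)

        return "".join(walk(self.parts))
```

The point of `flatten` is that going from the structured model to a single `content` string is trivial, while going the other way is not: once everything is one blob, the section boundaries and their identifiers are gone.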
Or maybe it's fine that an Article is just a stub converted from the name and summary and url. Maybe we care more about the metadata than the actual content, like how `as:inReplyTo` or `sioc:reply_of`/`sioc:has_reply` are used to indicate that one thing is a response to another thing, or `sioc:has_container` can be used to link a Post to a Thread or Forum.
I still think that some of the things in this FEP are doubling down on the problem rather than making it better. Mainly, I still have concerns about the proposed use of `preview` to essentially serve as an "alternate" instead, and I am concerned that the `preview` being a different object will open the door to people replying to the preview when they meant to reply to the article. I also have more general concerns about the specificity of AS2-Vocab and its content model, but that's broader than just this FEP, and I don't really have a better answer at this time. I am worried that publishing AS2 documents is going to become a highly idiosyncratic thing where you have to deal with so many "fediverse" consumer quirks that you can't express yourself as you intended. By dint of having only one `content` property, you are already stuck in lowest-common-denominator form. In that regard, I'm not sure how much this FEP actually "matters" for "long-form text", since many of the recommendations it makes regarding properties are recommendations that apply more generally to things that aren't "long-form text" also. It might be worth considering how much of this "long-form text" stuff would overlap with a more general/broad "social media" FEP. But again, I don't think AS2-Vocab or AP are equipped to make this distinction... nor am I sure how much it makes sense to make this distinction.
-
Hello!
This is a discussion thread for the proposed FEP-b2b8: Long-form Text. Please use this thread to discuss the proposed FEP and any potential problems or improvements that can be addressed.
Summary
Multi-paragraph text is an important content type on the Social Web. This FEP defines best practices for representing and using properties of a long-form text object in Activity Streams 2.0.
cc @eprodrom
Another distinction is that Articles have a different role in conversations. It is typically a long form blog post or journalistic article that people may or may not be able to comment on. It is something people talk about. There is a clear distinction between the blog post and the comments.
Note just seems like a short version of Article. Someone posted something and you may or may not be able to reply to it. (That is probably why there is confusion over what a note versus article is.)
Note makes sense for Twitter-style platforms, since they are primarily broadcast-style platforms. You say something and it gets broadcast to your followers. They do not support threaded conversations, so every reply is considered a new top-level post.
But this does not make sense for conversational platforms like forums and discussion groups, which are threaded conversations, not broadcast-style social media.
We currently have no type for conversations. I do not consider a forum post or comment a Note or an Article. The fact that some platforms use Article for forum posts seems odd, since they do not resemble articles in any way, shape, or form. The only thing they have in common with an Article is that there is a top-level post and comments. And Note does not seem appropriate either, for similar reasons.
If we are not going to create clear distinctions between types, or just misuse them, then we might as well just have one type, call it Stuff, and be done with it.
-
At first I was opposed to how Mastodon handles Articles, but if you define an article the traditional way (journalistic article, blog post, etc.) then it makes perfect sense. If a platform cannot display the HTML in the article properly, it SHOULD link to it instead of trying to display it.
And the current trend to mark things that are clearly not Articles as Articles, and then stuffing the Summary with the body of the post so it shows up on Mastodon is... a workaround that should not have happened. A Summary should not contain HTML and should remain short. After all, it is a summary, not the body of a post. (And some platforms strip all of your fancy HTML in the summary field anyway.)
All of this because there is no type for Conversation, and even if there was, some platforms would not recognize it.
I think the only solution is to use the same model email has. They send a text version and an HTML version, and the client picks what it wants to display. But in our case, we send:
- A Link to the resource.
- A Summary of the resource (short, no formatting, may contain content warnings).
- A Note: A Simplified View (a version of the content with limited formatting).
- An Article: A Fully Formatted View (a version that contains HTML, which will be sanitized upon display).
If a platform does not support articles, or wishes to link to articles instead of display them, then it can use the content of the Summary or Note plus the Link to display something meaningful. Platforms that support articles can use the Summary or Note content on some displays (like the inbox or recent posts view) and the Article content on the single post page with its comments, if any.
Feel free to rename Note and Article in the example above in future specifications. I am just using terms people are familiar with.
But we need to somehow get away from everything being a note, and the misuse of article for things that are clearly not articles.
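A rough sketch of how a receiving client might pick between those four parts, written in Python. Everything here is hypothetical: the `Post` fields just mirror the list above, the `choose_view` helper and its two flags are made up, and preferring the Note over the Summary as the fallback is my own assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Post:
    link: str                      # always present: where the resource lives
    summary: Optional[str] = None  # short, no formatting
    note: Optional[str] = None     # simplified view, limited formatting
    article: Optional[str] = None  # fully formatted HTML (sanitize before display)


def choose_view(post: Post, supports_articles: bool, single_post_page: bool) -> Tuple[str, str]:
    """Return (kind, text) for the part this client should display."""
    # Only platforms that support articles show the full view, and only on
    # the single post page; list views fall through to the shorter parts.
    if supports_articles and single_post_page and post.article is not None:
        return ("article", post.article)
    for kind in ("note", "summary"):
        text = getattr(post, kind)
        if text is not None:
            return (kind, text)
    # Nothing displayable was sent, so all the client can do is link out.
    return ("link", post.link)
```

A microblog-style client would call `choose_view(post, supports_articles=False, single_post_page=False)` and show the result next to `post.link`, which is roughly the email-style "client picks what it wants to display" behavior.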
-
@julian What is the purpose of `preview` if Mastodon can already render `summary`?
silverpill@mitra.social missed your reply.
I'm not here to decide what's right or wrong, just going with consensus. In any case, a dedicated `preview` would allow implementors to opt in to an alternative representation that better respects the constraints supplied by Mastodon and other microblog-focused software. Things like lack of support for inline images, and the use of `attachment`. `summary` gets you part of the way there, but Mastodon would still strip out the inline images, and I don't want to add image assets to `Article` in `attachment` because I want to promote the support for inline images for non-`Note`s.
-
silverpill@mitra.social specifically, though, the idea of providing a `rel="alternate"` would be more appropriate than using `preview`. (cc trwnh@mastodon.social)
What that ends up looking like is to be determined, but I am optimistic.
-
wistex@socialhub.activitypub.rocks why a link when you can set `url`?