Algorithmic curation

{{Short description|Algorithmic selection of online media}}
[[File:Mastodon mobile web screenshot.png|thumb|A feed of posts curated for a user on the Mastodon social network]]

'''Algorithmic curation''' is the selection of online media by technologies such as recommender systems and personalized search. Curation entails the selective sharing of online content and recommendations based on inferred interests.<ref name=":0">{{Citation |last1=Khan |first1=Sadia |title=Curation |date=2018 |encyclopedia=The International Encyclopedia of Media Literacy |pages=1–9 |url=https://onlinelibrary.wiley.com/doi/abs/10.1002/9781118978238.ieml0047 |access-date=2025-11-21 |publisher=John Wiley & Sons, Ltd |language=en |doi=10.1002/9781118978238.ieml0047 |isbn=978-1-118-97823-8 |last2=Bhatt |first2=Ibrar}}</ref> Curation algorithms implement different filtering approaches, such as collaborative filtering and content-based filtering. Examples include search engine and social media products such as the Twitter feed, Facebook's News Feed, and Google Personalized Search.<ref name=":1">{{Cite journal |last1=Berman |first1=Ron |last2=Katona |first2=Zsolt |date=Sep 2016 |title=The Impact of Curation Algorithms on Social Network Content Quality and Structure |url=https://ideas.repec.org//p/net/wpaper/1608.html |journal=Working Papers |language=en}}</ref>

== History ==

=== Early algorithmic curation ===

Online platforms use newsfeed algorithms to determine what content to present to each user.<ref name=":15">{{Cite journal |last1=Gausen |first1=Anna |last2=Luk |first2=Wayne |last3=Guo |first3=Ce |date=2022-12-28 |title=Using Agent-Based Modelling to Evaluate the Impact of Algorithmic Curation on Social Media |url=https://doi.org/10.1145/3546915 |journal=J. Data and Information Quality |volume=15 |issue=1 |pages=2:1–2:24 |doi=10.1145/3546915 |issn=1936-1955|url-access=subscription }}</ref><ref name=":16">{{Cite journal |last1=Bandy |first1=Jack |last2=Diakopoulos |first2=Nicholas |date=2021-04-22 |title=More Accounts, Fewer Links: How Algorithmic Curation Impacts Media Exposure in Twitter Timelines |url=https://dl.acm.org/doi/10.1145/3449152 |journal=Proc. ACM Hum.-Comput. Interact. |volume=5 |issue=CSCW1 |pages=78:1–78:28 |doi=10.1145/3449152|url-access=subscription }}</ref> The volume of content published on social media platforms created a need for automated filtering, as manual review of all available content by users is not feasible.<ref name=":15" /><ref name=":16" /> These systems function as a form of gatekeeper, shaping which new material users are exposed to and influencing knowledge, attention, and political exposure.<ref name=":16" />

==== Information overload ====

Early ranking algorithms addressed information overload by surfacing the most recent or most popular posts.<ref name=":15" /> Later systems shifted toward ranking content based on predicted engagement, aiming to increase the time users spend on a platform.<ref name=":15" /> Research has found that these engagement-oriented systems can increase the spread of misinformation and contribute to political polarization as a side effect of optimising for user interaction.<ref name=":16" />

==== How algorithms change users' feeds over time ====

Algorithmic curation has been found to increase source diversity in some respects while simultaneously reducing the number of external links presented to users, which limits exposure to off-platform content.<ref name=":16" /> Research using agent-based modelling has examined how user behaviour, information quality, and algorithmic design interact with one another over time.<ref name=":15" /><ref name=":16" />

=== Emergence of AI ===

Platforms increasingly shifted from rule-based ranking systems toward machine-learning and AI-driven approaches, which allow feeds to be personalised at a larger scale and with greater responsiveness to user behaviour.<ref name=":15" /><ref name=":16" /> For example, X (formerly Twitter) moved away from a chronological feed toward an AI-powered ranking system that personalises content for each user.<ref name=":16" /> These systems are capable of making ranking decisions across volumes of content and user interactions that would not be practical to handle manually.<ref name=":16" />

== Approach ==

=== Filter types ===

==== Collaborative filtering ====

Collaborative filtering (CF) methods create recommendations based on a person's usage patterns.<ref name=":2">{{Cite book |last=Herlocker |first=Jonathan |title=Proceedings of the 2000 ACM conference on Computer supported cooperative work |chapter=Explaining Collaborative Filtering Recommendations |date=2000 |pages=241–250 |doi=10.1145/358916.358995 |isbn=1-58113-222-0 |chapter-url=https://dl.acm.org/doi/pdf/10.1145/358916.358995}}</ref> CF predicts a person's preference for an item by matching their interests with those of users who have similar interests.<ref name=":2" /> This process allows for the sharing of ratings between users with similar profiles.<ref name=":2" /> CF is based on patterns of human behaviour rather than machine analysis of content itself.<ref name=":2" /> Users of CF systems rate items they have interacted with, and these ratings form a profile of interests.<ref name=":2" /> The CF system then matches that user with others who have similar profiles, and uses their ratings to generate recommendations.<ref name=":2" /> Collaborative filtering can be applied across various content types including text, images, music, and financial products, and can account for complex attributes such as taste and quality that are difficult to represent explicitly.<ref name=":3">{{Cite web |title=Online Recommender Systems – How Does a Website Know What I Want? {{!}} |url=https://blogs.ams.org/mathgradblog/2015/05/25/online-recommender-systems-website-want/ |access-date=2025-11-21 |language=en-US}}</ref>
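
The matching step described above can be illustrated with a minimal user-based sketch. It is one possible formulation, not the method of any particular platform: the rating data, the 1 to 5 rating scale, and the function names are hypothetical, and cosine similarity is assumed here as the measure of how alike two rating profiles are.

```python
from math import sqrt

# Hypothetical rating profiles: user -> {item: rating on an assumed 1-5 scale}
ratings = {
    "ana": {"a": 5, "b": 4, "c": 1},
    "ben": {"a": 5, "b": 5, "d": 4},
    "cat": {"a": 1, "c": 5, "d": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity of two rating profiles (0.0 if no items are shared)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with a similar profile has rated it.
    """
    weighted_sum = 0.0
    similarity_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        weighted_sum += sim * their_ratings[item]
        similarity_total += sim
    return weighted_sum / similarity_total if similarity_total else None


# "ana" has not rated "d"; the estimate leans toward "ben", whose profile is closer.
print(predict("ana", "d", ratings))
```

In this sketch the prediction for an unrated item is pulled toward the ratings of the most similar users, which mirrors the sharing of ratings between similar profiles described above.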

==== Content-based filtering ====

Content-based filtering (CBF) builds a user profile to represent the types of items a user has engaged with, based on keywords and attributes used to describe those items.<ref name=":3" /><ref name=":4">{{Cite journal |last1=Wang |first1=Donghui |last2=Liang |first2=Yanchun |last3=Xu |first3=Dong |last4=Feng |first4=Xiaoyue |last5=Guan |first5=Renchu |date=2018-10-01 |title=A content-based recommender system for computer science publications |journal=Knowledge-Based Systems |volume=157 |pages=1–9 |doi=10.1016/j.knosys.2018.05.001 |issn=0950-7051|doi-access=free }}</ref> Recommendations are generated by presenting items similar to those the user has previously engaged with or is currently viewing.<ref name=":4" /> The CBF method creates a profile for each item based on discrete attributes and features, and then constructs a content-based user profile using a weighted vector of those features derived from items the user has rated, purchased, or interacted with.<ref name=":3" /><ref name=":4" /> The weights represent the relative importance of each feature, and can be computed using techniques such as Bayesian classifiers, cluster analysis, decision trees, and artificial neural networks, with the goal of estimating the probability that a user will engage with a suggested item.<ref name=":3" /> One application of content-based filtering is Pandora Radio, where users provide an artist, genre, or composer to generate a station, and the system surfaces music with similar attributes.<ref name=":3" />
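
The item-profile and user-profile construction described above can be sketched as follows. This is a simplified illustration under stated assumptions: the item catalogue and feature names are hypothetical, and the feature weights are taken as a plain rating-weighted sum rather than being learned with a classifier or neural network.

```python
from math import sqrt

# Hypothetical item profiles: item -> {feature: weight}
items = {
    "song1": {"jazz": 1.0, "piano": 1.0},
    "song2": {"jazz": 1.0, "vocal": 1.0},
    "song3": {"metal": 1.0, "guitar": 1.0},
    "song4": {"jazz": 1.0, "piano": 1.0, "vocal": 1.0},
}


def build_profile(rated, items):
    """User profile: rating-weighted sum of the feature vectors of rated items."""
    profile = {}
    for item, rating in rated.items():
        for feature, value in items[item].items():
            profile[feature] = profile.get(feature, 0.0) + rating * value
    return profile


def cosine(u, v):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(rated, items, k=2):
    """Rank items the user has not rated by similarity to the user profile."""
    profile = build_profile(rated, items)
    candidates = [item for item in items if item not in rated]
    candidates.sort(key=lambda item: cosine(profile, items[item]), reverse=True)
    return candidates[:k]


# A user who rated a jazz piano track highly and a metal track poorly.
print(recommend({"song1": 5, "song3": 1}, items))  # ['song4', 'song2']
```

Because the profile is built only from item attributes, the sketch can score an item that no other user has rated, which is the main practical difference from collaborative filtering.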

== Technology ==

=== Recommender system ===

Recommender systems rank and suggest content to users based on a combination of implicit and explicit user input.<ref name=":5">{{Cite journal |last1=Roy |first1=Deepjyoti |last2=Dutta |first2=Mala |date=2022-05-03 |title=A systematic review and research perspective on recommender systems |journal=Journal of Big Data |volume=9 |issue=1 |pages=59 |doi=10.1186/s40537-022-00592-5 |doi-access=free |issn=2196-1115}}</ref> Implicit signals include time spent viewing or engaging with a specific item.<ref name=":5" /> Explicit signals include actions such as liking posts, saving store pages, reading news articles, or sharing content.<ref name=":5" />
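
One simple way to combine the two kinds of signal is a weighted score per item, sketched below. The field names, the weights, and the cap on viewing time are all hypothetical values chosen for illustration; production systems typically learn such weights from data rather than fixing them by hand.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    item: str
    seconds_viewed: float = 0.0  # implicit signal
    liked: bool = False          # explicit signal
    shared: bool = False         # explicit signal


# Hypothetical weights; assumed for illustration only.
WEIGHT_PER_SECOND = 0.01
WEIGHT_LIKE = 1.0
WEIGHT_SHARE = 2.0
MAX_SECONDS = 300.0  # cap so one long view cannot dominate the score


def interest_score(x: Interaction) -> float:
    """Combine implicit dwell time with explicit actions into one score."""
    dwell = min(x.seconds_viewed, MAX_SECONDS) * WEIGHT_PER_SECOND
    return dwell + WEIGHT_LIKE * x.liked + WEIGHT_SHARE * x.shared


def rank(interactions):
    """Return item identifiers ordered from highest to lowest score."""
    ordered = sorted(interactions, key=interest_score, reverse=True)
    return [x.item for x in ordered]


history = [
    Interaction("a", seconds_viewed=30),
    Interaction("b", seconds_viewed=10, liked=True),
    Interaction("c", seconds_viewed=5, shared=True),
]
print(rank(history))  # ['c', 'b', 'a']
```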

=== Personalized search ===

Personalized search aims to retrieve results most relevant to the user by incorporating contextual factors beyond the explicit query, such as past queries, browsing history, and inferred interests.<ref>{{Cite book |last1=Dou |first1=Zhicheng |last2=Song |first2=Ruihua |last3=Wen |first3=Ji-Rong |chapter=A large-scale evaluation and analysis of personalized search strategies |date=2007-05-08 |title=Proceedings of the 16th international conference on World Wide Web |chapter-url=https://dl.acm.org/doi/10.1145/1242572.1242651 |language=en |location=Banff Alberta Canada |publisher=ACM |pages=581–590 |doi=10.1145/1242572.1242651 |isbn=978-1-59593-654-7|chapter-url-access=subscription }}</ref> Social media platforms such as X (formerly Twitter) and Bluesky generate recommendations based on similar users and the content those users interact with.<ref name=":6">{{Cite journal |last1=Liu |first1=Yuhan |last2=Song |first2=Emmy |last3=Zhang |first3=Owen Xingjian |last4=Merriman |first4=Jewel |last5=Zhang |first5=Lei |last6=Monroy-Hernández |first6=Andrés |date=2025-10-16 |title=Understanding Decentralized Social Feed Curation on Mastodon |journal=Proc. ACM Hum.-Comput. Interact. |volume=9 |issue=7 |pages=CSCW507:1–CSCW507:25 |doi=10.1145/3757688|doi-access=free }}</ref> Personalized search may also allow users to explicitly filter results by blocking content containing certain phrases or hashtags.<ref name=":7">{{Cite journal |last1=Quelle |first1=Dorian |last2=Bovet |first2=Alexandre |date=2025-02-26 |title=Bluesky: Network topology, polarization, and algorithmic curation |journal=PLOS ONE |language=en |volume=20 |issue=2 |article-number=e0318034 |doi=10.1371/journal.pone.0318034 |doi-access=free |pmid=40009593 |arxiv=2405.17571 |bibcode=2025PLoSO..2018034Q |issn=1932-6203}}</ref> For first-time users without prior history, personalized search may draw on content-based filtering to establish an initial context.<ref name=":3" /> Similar processes are used by search engines and retail platforms to tailor results and product recommendations to individual users.

== AI contribution ==

Artificial intelligence contributes to algorithmic curation through machine-learning models capable of processing large volumes of data.<ref name=":10">{{Cite journal |last1=Lazer |first1=David |last2=Swire-Thompson |first2=Briony |last3=Wilson |first3=Christo |date=2024-09-01 |title=A Normative Framework for Assessing the Information Curation Algorithms of the Internet |url=https://doi.org/10.1177/17456916231186779 |journal=Perspectives on Psychological Science |language=EN |volume=19 |issue=5 |pages=749–757 |doi=10.1177/17456916231186779 |pmid=38010888 |issn=1745-6916|url-access=subscription }}</ref> Techniques such as deep learning and reinforcement learning allow curation algorithms to model user preferences with greater granularity alongside established filtering approaches.<ref name=":10" /> This enables platforms to adjust content rankings rapidly in response to user behaviour.<ref name=":10" /> In social media and streaming contexts, AI-driven systems arrange feeds according to predicted relevance, with the outputs shaped by patterns present in the training data.<ref>{{Cite book |last1=Villermet |first1=Quentin |last2=Poiroux |first2=Jérémie |last3=Moussallam |first3=Manuel |last4=Louail |first4=Thomas |last5=Roth |first5=Camille |chapter=Follow the guides: Disentangling human and algorithmic curation in online music consumption |date=2021-09-13 |title=Fifteenth ACM Conference on Recommender Systems |chapter-url=https://doi.org/10.1145/3460231.3474269 |series=RecSys '21 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=380–389 |doi=10.1145/3460231.3474269 |isbn=978-1-4503-8458-2|arxiv=2109.03915 }}</ref>

== Social media and potential impact ==

=== Echo chambers ===

Social media algorithms, such as those used by X (formerly Twitter), recommend content that the system predicts a user will engage with positively. Content from accounts with differing perspectives is less likely to be surfaced, which may reduce source and topic diversity and contribute to the formation of echo chambers.<ref name=":16" /> For example, Facebook's news feed is designed to surface content aligned with users' prior engagement, which may reinforce existing views.<ref name=":8">{{Cite journal |last1=Papa |first1=Venetia |last2=Photiadis |first2=Thomas |date=2021-12-15 |title=Algorithmic Curation and Users' Civic Attitudes: A Study on Facebook News Feed Results |journal=Information |language=en |volume=12 |issue=12 |pages=522 |doi=10.3390/info12120522 |doi-access=free |issn=2078-2489 |hdl=20.500.14279/32861 |hdl-access=free }}</ref> This dynamic may contribute to filter bubbles, in which users are seldom exposed to content outside their existing interests. Users may further narrow their feeds by actively blocking certain content or accounts.<ref name=":16" />

=== Over-representation ===

A pattern observed across social media platforms is the concentration of algorithmic visibility among a small subset of users. Content from the most active users, those with the largest followings, or those generating the most engagement tends to be surfaced more frequently, meaning a small number of accounts can account for a disproportionate share of what appears in other users' feeds.<ref name=":16" />

== References ==

{{reflist}}

[[Category:Social media]]
[[Category:Mass media monitoring]]
[[Category:Social influence]]