I use these agents on Huginn:
C[Time trigger]-->RA[RSS Agents]-->ST
RA-->G
G[Grabbers website agents]-->ST
ST[show title text]-->FT
FT[RSS Output Agents]
There's no secret besides the feed-fetching token. Example with satwcomic.com/latest:
- `https://satwcomic.com/food-crimes`: grabs `img[itemprop]` as main content, and `[itemprop="articleBody"]` as extra text
- (`img` has size, but G's output does not)

The transformations I want are:
- `img[title]`: copy the title out as a sibling `p`
- `img`: remove the `srcset`, `height`, and `width` attributes, and remove ComicPress resizing parameters from the `src` attribute
- `alt` and `title`
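
A minimal sketch of the `src` cleanup step, using only the Rust standard library. The parameter names to drop are passed in by the caller; the names used in the usage note (`resize`, `fit`) are placeholders, since these notes do not say which query parameters ComicPress actually adds. `strip_query_params` is a hypothetical helper, not part of Huginn or any crate.

```rust
/// Remove the named query parameters from a URL and keep everything else.
/// `unwanted` is an assumption: the real resizing parameter names are not
/// listed in these notes.
fn strip_query_params(url: &str, unwanted: &[&str]) -> String {
    // Split off a fragment first so it can be re-attached untouched.
    let (rest, fragment) = match url.split_once('#') {
        Some((r, f)) => (r, Some(f)),
        None => (url, None),
    };
    let (base, query) = match rest.split_once('?') {
        Some((b, q)) => (b, q),
        None => (rest, ""),
    };
    // Keep only the key=value pairs whose key is not in the unwanted list.
    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            let key = pair.split('=').next().unwrap_or("");
            !unwanted.contains(&key)
        })
        .collect();
    let mut out = String::from(base);
    if !kept.is_empty() {
        out.push('?');
        out.push_str(&kept.join("&"));
    }
    if let Some(f) = fragment {
        out.push('#');
        out.push_str(f);
    }
    out
}
```

Calling `strip_query_params(src, &["resize", "fit"])` would drop those two parameters and leave any others in place. Dropping `srcset`, `height`, and `width` needs a real HTML parser and is not covered here.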
Inoreader remembers items per feed, I hope. It'd be nice to sort items
by some date before letting Inoreader fetch them.
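
A rough sketch of that sorting step, assuming each item carries an optional publication time as Unix seconds. The `Item` struct and `sort_by_date` are hypothetical names for illustration; nothing here is Huginn's or Inoreader's own data model.

```rust
/// Hypothetical feed item; only the fields needed for sorting.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    title: String,
    /// Publication time as Unix seconds; `None` when the source gives no date.
    published: Option<u64>,
}

/// Sort oldest first. Undated items go last and keep their relative order,
/// because `sort_by_key` is a stable sort.
fn sort_by_date(items: &mut [Item]) {
    items.sort_by_key(|item| (item.published.is_none(), item.published));
}
```

Putting undated items last is a design choice, not a requirement: it keeps them from jumping ahead of dated items when the feed is read oldest first.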
Huginn stores agent settings & memory in PG. Some useful data we can get out is:
- `sort=last_event_at.asc`, to detect idle sources

rsspls and news-flash crates are possible idea sources.
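
Going back to the idle-source check: once the `last_event_at` values are out of the database, the filtering could look roughly like this. `AgentRow`, `idle_agents`, and the threshold are hypothetical; only the `last_event_at` name comes from these notes, and how the rows get loaded is left out.

```rust
/// Hypothetical row: an agent name plus its last event time.
struct AgentRow {
    name: String,
    /// Unix seconds of the last emitted event; `None` if it never fired.
    last_event_at: Option<u64>,
}

/// Return names of agents with no event in the last `max_idle_secs`,
/// oldest first. Agents that never fired come first, since `None`
/// sorts before `Some` in Rust.
fn idle_agents(rows: &[AgentRow], now: u64, max_idle_secs: u64) -> Vec<&str> {
    let mut idle: Vec<&AgentRow> = rows
        .iter()
        .filter(|r| match r.last_event_at {
            Some(t) => now.saturating_sub(t) > max_idle_secs,
            None => true,
        })
        .collect();
    idle.sort_by_key(|r| r.last_event_at);
    idle.into_iter().map(|r| r.name.as_str()).collect()
}
```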
We're using `miniserve` to keep the VM flying, but data is generated before build time with `cargo run --bin oneshot`, using one-time fetched XML files.