I need to save and index some discussions from LinkedIn and Facebook, and sometimes from discussion forums such as this one.
By their nature, discussions evolve, so a static format like PDF is not convenient: I would have to update it every time a new comment appears.
In many cases access is also restricted (login required), so a bookmark is not convenient either, because I would see different content depending on whether or not I was logged in!
What to do? HTML or web archive? I thought a web archive would prevent these problems, but…
I have a thread from a LinkedIn group saved as a web archive and, even though I refresh and/or update the web archive, it keeps showing me the page as it was when I saved it: none of the subsequent discussion appears. Update does not update. If I refresh I see the new content, but it doesn’t get saved!
RSS is usually the solution. It depends on whether you can subscribe to the forum, so check with the provider. One way to get what you want is to use IFTTT (If This Then That, a service that is free for now) to create a recipe that monitors a feed and creates a new document in Box or Dropbox (or sends an email, etc.) when the feed changes. Then index that Box or Dropbox folder.
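If you would rather not depend on a third-party service, the same "watch a feed, drop a new document into a folder" idea can be sketched in a few lines of Python. This is only a rough illustration, not what IFTTT does internally: it assumes the forum exposes a plain RSS 2.0 feed that needs no login, and the names `parse_items`, `save_new_items`, `fetch` and the `.seen.json` bookkeeping file are all made up for the example.

```python
import json
import re
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.request import urlopen


def parse_items(feed_xml):
    """Return a list of dicts (guid, title, link, body) from RSS 2.0 text or bytes."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        body = item.findtext("description", default="").strip()
        # Fall back to the link (or title) when the feed supplies no guid.
        guid = (item.findtext("guid") or link or title).strip()
        items.append({"guid": guid, "title": title, "link": link, "body": body})
    return items


def save_new_items(feed_xml, folder, seen_file=".seen.json"):
    """Write one text file per item not seen before; return the paths created."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    seen_path = folder / seen_file
    seen = set(json.loads(seen_path.read_text())) if seen_path.exists() else set()
    created = []
    for item in parse_items(feed_xml):
        if item["guid"] in seen:
            continue
        # Build a filesystem-safe name from the title.
        slug = re.sub(r"[^A-Za-z0-9]+", "-", item["title"]).strip("-") or "untitled"
        path = folder / f"{len(seen) + 1:04d}-{slug[:50]}.txt"
        path.write_text(
            f"{item['title']}\n{item['link']}\n\n{item['body']}\n", encoding="utf-8"
        )
        seen.add(item["guid"])
        created.append(path)
    # Remember which items were already saved so reruns add only new ones.
    seen_path.write_text(json.dumps(sorted(seen)))
    return created


def fetch(url):
    """Download the raw feed bytes (only works for feeds that need no login)."""
    with urlopen(url, timeout=30) as response:
        return response.read()
```

Run on a schedule, something like `save_new_items(fetch(feed_url), "/path/to/indexed/folder")` (both arguments are placeholders) would add one small text file per new post, and the folder can then be indexed as described above.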
I don’t know if I can use RSS on Facebook and LinkedIn… I don’t think so.
However, I have a thread from a LinkedIn group. Its posts are the same as they were when I saved the page into DT as a web archive.
If I do “update captured archive”, the new posts do not appear. Why?
Use Reload in the contextual menu to see the current state of the target web page. This will not preserve that state, however, because Reload does not modify the stored archive file. To make a copy of the current state, use Capture Page > Web Archive or some other Capture option. An alternative is to bookmark the page and take periodic archives of that page. None of these approaches creates indexed files. I'm not sure why indexing is important, but you could always export the new archives and re-index them.
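For the "periodic archives" route done outside the application, here is a minimal Python sketch of the bookkeeping: keep a timestamped copy of the page only when it differs from the last copy. It does not call any DEVONthink function; `snapshot_if_changed`, the `stem` prefix and the folder are hypothetical names, and the page content is passed in as bytes because a login-protected page has to be saved from a logged-in browser session first.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def snapshot_if_changed(content, folder, stem="thread", now=None):
    """Save content as a new timestamped file unless it matches the latest one.

    Returns the new Path, or None when nothing changed.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # Timestamped names sort chronologically, so the last one is the newest.
    existing = sorted(folder.glob(f"{stem}-*.html"))
    if existing:
        latest = hashlib.sha256(existing[-1].read_bytes()).hexdigest()
        if latest == hashlib.sha256(content).hexdigest():
            return None
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    path = folder / f"{stem}-{stamp}.html"
    path.write_bytes(content)
    return path
```

Keep in mind that many pages embed timestamps, ads or session tokens, so a byte-for-byte comparison may report a change even when no new comment was posted.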
IFTTT has widgets that capture data from Facebook and LinkedIn. I’m not trying to sell it, it’s just an option.
I reloaded and then I captured a new archive: this produced a new file with all the new content, and this is OK.
However, I don’t understand… I think this is what the “update captured archive” function should do, no?
No problem with IFTTT, but I need nothing more than to save a page sometimes and occasionally refresh its content.
“Update captured archive” resaves over the current archive. As korm pointed out, you need to refresh the page first, then update the file.