[Dept. of Computer Science]

the news.answers WWW archive

Usenet newsgroup news.answers

The Usenet newsgroup news.answers is a repository for periodic informational postings (also called Frequently Asked Questions postings, or FAQs) from other newsgroups. It is a moderated newsgroup. If you want to post on news.answers, you must follow the guidelines. Instead of posting on news.answers you can use the rtfm-mail-server. See the file faq-server-help for details.

A big thanks to the news.answers moderation team <news-answers-request@mit.edu> who organize the whole thing. Without them this archive wouldn't exist and faq-readership would be a lot more limited.

More info about faqs can be found in Finding and Writing FAQs and Periodic Postings and FAQ Maintenance Aids.

Archive maintainers

The news.answers archive of the Computer Science Department, Utrecht University, the Netherlands, is maintained by Henk Penning. Jaap Romers did the first version of the html-conversion and took care of the wais-indexing. I added the search-by-archive-name and search-by-newsgroup and maintain the news.answers ftp-archive on which the contents of the WWW archive are based.

We generate some statistics.

How it is done

The whole WWW news.answers archive is generated daily from scratch from an ftp archive of news.answers. The faqs are converted to html, leaving them textually intact, with one exception: each faq is preceded by a small note in red explaining the status of the document (archived usenet posting) and referring readers to the faq author(s) for matters concerning the content of the faq.
We attempt to convert text that looks like an http/ftp/gopher-url to something selectable, represented as the original text. The primary header is stripped to Subject:- and Newsgroups:-lines. The '*.answers' groups are stripped from the newsgroups, except for articles posted only in '*.answers'.
Generation of the html-faqs and -link-files is done by two small Perl programs and takes about 10 minutes.
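
The conversion step described above could look roughly like the following. This is a Python sketch, not the Perl programs the archive actually uses. The function name faq_to_html, the wording of STATUS_NOTE and the choice of <pre> and <font> markup are all assumptions made for illustration.

```python
import html

# Hypothetical wording; the archive's real note text is not given here.
STATUS_NOTE = (
    "This is an archived Usenet posting. "
    "Please refer to the faq author(s) for matters concerning its content."
)


def faq_to_html(subject, newsgroups, body):
    """Wrap a plain-text faq in minimal html without altering its text.

    The body is only escaped and placed inside <pre>, so line breaks and
    spacing stay exactly as the author posted them.
    """
    return "\n".join([
        "<html><head><title>%s</title></head><body>" % html.escape(subject),
        # The one addition to the document: a status note in red.
        '<p><font color="red">%s</font></p>' % html.escape(STATUS_NOTE),
        "<pre>",
        "Subject: " + html.escape(subject, quote=False),
        "Newsgroups: " + html.escape(newsgroups, quote=False),
        "",
        html.escape(body, quote=False),
        "</pre></body></html>",
    ])
```

Putting the body inside <pre> is one simple way to keep the faq textually intact: nothing is reflowed or split, only characters that are special in html get escaped.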

The ftp archive is also updated daily from the Usenet spool tree.

Why it is done this way

Since maintaining a WWW archive of news.answers is not exactly in my job description, if it can't be automated, I can't do it.

Leaving faqs textually intact is done because not all authors like a split-up of their faq. I can't determine (automagically) who does and who doesn't.

The note in red was added after much hesitation. I don't want to offend authors by sticking my words into their faq. However, from time to time, readers were confused as to the responsibilities of the author and those of the archivist. The note hopes to clear that up in a neutral way. It is meant to be like a stamp in a library book.

Turning text into selectable urls is done because it makes the html'ed faqs a lot more useful. The substitution can almost always be done automatically. However, if the url is embedded in url-like text, the generated href is too long. Sometimes the href is too short because the url looks too much like the surrounding text! I feel entitled to a few mistakes because the substitution doesn't change the textual representation of the author's text.
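
A minimal sketch of such a substitution, in Python rather than the Perl actually used. The pattern URL_RE and the function name linkify are hypothetical; the real programs may recognize urls quite differently. The point it illustrates is that only markup is added around the url, so the visible text stays the same.

```python
import html
import re

# Hypothetical pattern: a known scheme followed by characters that are
# unlikely to belong to the surrounding prose.
URL_RE = re.compile(r"""\b(?:http|ftp|gopher)://[^\s<>"')\]]+""")


def linkify(text):
    """Escape text for html and wrap url-like runs in anchors."""
    out = []
    pos = 0
    for m in URL_RE.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the url.
        url = m.group(0).rstrip(".,;:!?")
        end = m.start() + len(url)
        out.append(html.escape(text[pos:m.start()], quote=False))
        out.append('<a href="%s">%s</a>'
                   % (html.escape(url), html.escape(url, quote=False)))
        pos = end
    out.append(html.escape(text[pos:], quote=False))
    return "".join(out)
```

Stripping the trailing punctuation is one heuristic against hrefs that come out too long. It will still guess wrong now and then, which is the kind of mistake described above; removing the tags from the result always gives back the escaped original text.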

Stripping the primary header to Subject:- and Newsgroups:-lines is done because the header is so very messy. The header facilitates transmission of the faq on Usenet and access by news-readers. It has little to do with the content (body) of the faq.
The '*.answers' newsgroups are stripped because they are too big to help in searching, except for articles posted only in those groups.
The Subject:-line can be used as some sort of title and the Newsgroups:-line can be used to access the by-newsgroup hierarchy. The Summary:-line is left out because it is often repeated in the Subject or the first part of the faq.
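
The header stripping can be sketched as follows. This is Python for illustration, not the archive's Perl; the function name strip_header and the simple header parsing are assumptions, and secondary headers inside the body are left alone.

```python
from typing import Tuple


def strip_header(article: str) -> Tuple[str, str, str]:
    """Return (subject, newsgroups, body) from a raw Usenet article."""
    # The primary header ends at the first empty line.
    head, _, body = article.partition("\n\n")
    fields = {}
    name = None
    for line in head.splitlines():
        if line[:1] in (" ", "\t") and name:
            # Folded continuation of the previous header line.
            fields[name] += " " + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            name = name.strip().lower()
            fields[name] = value.strip()
    groups = [g.strip() for g in fields.get("newsgroups", "").split(",")
              if g.strip()]
    others = [g for g in groups if not g.endswith(".answers")]
    # Keep the '*.answers' groups only when nothing else is left.
    kept = others or groups
    return fields.get("subject", ""), ",".join(kept), body
```

For an article posted to comp.lang.c, comp.answers and news.answers this keeps only comp.lang.c; an article posted only to news.answers keeps news.answers. Everything else in the header, including the Summary:-line, is dropped.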

The news.answers ftp archive

Even before the creation of news.answers, members of the department tried to keep up an ftp archive of faqs posted on Usenet. The introduction of the Usenet newsgroup news.answers made maintaining a proper faq archive look easy. Adding faqs to the archive was simply a matter of scanning /usr/spool/news/news/answers and copying files. However, as it turned out with news.answers, deleting faqs has become the main problem. Files become obsolete because faqs acquire a new archive-name, or worse, faqs simply die. In the archive, 10% of the faqs are more than three months old. Some of the most popular faqs (sex, puzzles) have not been posted for over a year.
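
Finding candidates for deletion can be sketched like this, in Python for illustration. It assumes that a file's modification time reflects the last time the faq was posted, which may not hold for every archive; the function name stale_faqs and the 90-day default are hypothetical. It only lists files and leaves the decision to delete to the archivist.

```python
import os
import time


def stale_faqs(archive_root, max_age_days=90, now=None):
    """Yield (path, age_in_days) for files older than max_age_days."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            # Assumption: mtime is the time of the most recent posting.
            age_days = (now - os.path.getmtime(path)) / 86400
            if age_days > max_age_days:
                yield path, age_days
```

A list like this cannot tell a faq that has died from one that was renamed or is merely posted rarely, which is why deleting remains the hard part.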

Other news.answers archives

Other news.answers archives on WWW are:
penning@cs.uu.nl Wed Jun 27 11:22:48 CEST 2007