
Subject: [comp.unix.bsd] NetBSD, FreeBSD, and OpenBSD FAQ (Part 3 of 10)

This article was archived around: 13 Oct 1997 02:00:11 -0500

All FAQs in Directory: 386bsd-faq
All FAQs posted in: comp.unix.bsd.netbsd.announce, comp.unix.bsd.freebsd.announce, comp.unix.openbsd.announce
Source: Usenet Version

Posted-By: auto-faq
Archive-name: 386bsd-faq/part3
Section 2. (Common installation questions)

2.0 Install process

Once the files are on floppies, thoughts usually turn to questions about how to install the boot image on a floppy. The rawrite program (for DOS) is used to write the bootable images (dist.fs and fixit.fs) onto floppies. The same image can be used for 3 1/2 and 5 1/4 high density diskettes. NetBSD uses the .fs file extension for its floppy images. FreeBSD uses the .flp extension.

Once the bootable images are written onto the floppies, insert the dist.fs disk into the A: drive and reboot. If the system does not boot, see section 2.5 below for more information. If the disk boots, type install and proceed to use the INSTALL.NOTES to get more information.

Problems with the install are either related to hardware (i.e. Do you want to install on your T.V.?) or software. Of the hardware issues, the most common FAQs are usually straight out of the installation notes. Of the software issues, there are only two that really concern us.

The first is bad files. On some systems, files that are loaded from floppy appear to 'go bad' when they arrive on the hard disk. Try some of these solutions:

- You forgot to set binary mode when you transferred the files. Don't get insulted. Those of us that FTP for a living forget sometimes. If so, the distribution will come out with all different sizes and install will complain about every disk.

- One or two of the files are no good. Try getting them again. As a precaution, rename the bad files on your hard drive to names like foo.1 and bob.23. Copy the files again from floppy. If they are still bad, rename the file, and the one immediately before the first bad file (bin01.23 if bin01.24 is bad) and copy them again. If they are still bad, download those files again from the distribution site (including the one before and after the bad one) and try again.
The reason for renaming the files is that sometimes, especially with drives that do not auto-magically record bad sectors, you could copy a distribution file onto a bad spot on the disk. If this happens, you want to isolate the bad spot. The easiest way to do that is just to leave the bad file on it.

For those of you that have received your system on a CD-ROM, you will need to find at least three things. One is this file. Since you are reading it, I assume that you got it already. :-) If you can't read this file (you got it from the newsgroup, for example) there is one thing that you need to know so you don't look like a complete idiot on the net. There is no such thing as a Unix CD-ROM. They are all in something called the ISO CD-ROM format. You can read them as the D: drive in DOS, or mount them on your Sun or SVR4 system or whatever.

Second, you will need to find the directory with the bootable disk images in it. They will have self explanatory names like:

<pre>
kerncopy.fs
base0-9.fs
fred.fs
genericaha.fs
boot-me-first.fs
this-is-the-file-with-the-fs-extension.fs
</pre>

You get the idea, right? Third, look for the MS-DOS program "RAWRITE.EXE". It should be right near the file system (.fs) files. Another clue for the truly lost will be that the file system files will all be 1.2 Meg big. These files will fit onto either a 1.2Meg 5.25 inch diskette, or a 1.44Meg 3.5 inch diskette. Use rawrite to write the fs files to diskettes and boot from the diskettes.

The FreeBSD system uses a system 'pretty much' the same as this, except that the filesystem files are 1.2 Meg files and they all have a '.flp' extension. Other than that, the instructions apply. You did back your system up, right?

For those of you trying to build installation floppies, you will need to verify the media type on the 'dd' and 'disklabel' commands in the Makefile. The default is to build to 1.2 Meg disks (being the smallest in terms of room).
Change the 12100 and floppy5 entries to 14410 and floppy3 (respectively).

2.0.1 Boot disks (versions and media formats)

I have the base system installed, and now I want to install the rest of the system. Where did the 'extract' program go?

When installing NetBSD, the 'set_tmp_dir' and 'extract' programs are part of the .profile that is booted when you are installing. This .profile is overwritten as part of the install process, and extract then disappears. If you need extract again, you can mount the install disk and source .profile. This will recreate these two routines.

There is also an install procedure that FreeBSD uses that does the same job. It is defined as part of the .profile on one of the installation floppies. You can either copy it from there, or use the procedure for 'real disk partitioning'. Failing that, you can use the following process to extract the sources.

- First, 'cd' to where your files are.

- Assuming you want to extract the kernel sources, use the following command to extract the files: "cat ksrc* | tar -xvf - -C /" This will combine the pieces, feed them to tar, and load the files in the 'standard' place.

The floppy booted, but now the hard disk won't boot? I am trying to reinstall. I run install and it loops asking me if I want to use the whole disk?

The most likely culprit is your hard disk controller. If you have an IDE or EIDE controller, it is probably doing some type of disk translation for you. If this is the case (assume it is) then you will need to find out the real disk controller geometry, and rewrite your disk label. See section 2.6.2, but before doing that get the program pfdisk.exe. This program will tell you what the controller geometry is (right before it reboots your computer). Make the disklabel agree with this program and your system should boot. You may have to reinstall, but at least your disklabel will be right. Note that this is a nearly required step for all NetBSD and FreeBSD installs.
You need to know what the disk geometry is before the BIOS messes with it. If you start having these kinds of installation problems, I can virtually assure you that it is because your controller geometry and your disklabel geometry are different.

NOTE: If the hard disk controller does NOT have an option for turning off the geometry translation, you may well be completely out of luck. There are very few controllers that fall into this category. The ones that do full time translation will often boot up in translated mode. pfdisk will help you determine the correct geometry for your drive by telling you what the geometry looks like when 386bsd boots up. But on the other hand, maybe not... See section 2.5.5 below for a detailed set of instructions about getting NetBSD (and by implication 386BSD and FreeBSD) to work with a system that uses full time translation.

What are the options on the boot prompt?

The most amazing thing about the boot process in *BSD is the boot up alternatives that are available. There is little that a person can NOT do from the boot prompt. The boot diskette or disk can be selected; for example, fd(1,a) (that is, fd1a; my B: drive is DOS) can be the source of my kernel. In addition, the name of the kernel can be chosen (this allows you to boot with a test kernel or reboot an older kernel if the new one gets hosed). Finally, there are three choices for options that may or may not work, depending on the age and proclivities of your boot blocks. These options are documented in 2.5.9 below.

I just used the '-s' option on the boot, but I can't write anything onto the disk. What is wrong? If I use a plain 'mount' command it tells me that my root file system is read-only.

In single-user mode (system booted with -s or an error in one of the processes started by /etc/rc) the root filesystem mounts as read-only by default. This was intended so that some range of problems would not be made worse by writes to the disk.
The 'dos' partitions mount as read-only because there are reservations as to how well some of the FreeBSD tools work with the pcfs. The same kind of reservations exist with NetBSD and the '-t msdosfs' option. These options (-r for read-only, -w for read-write) can be set in /etc/fstab. The status of both can be changed to read-write with 'mount -wu /{mount-dir}' (where {mount-dir} is the name of the directory on which the offending partition is mounted). Particularly for the dos filesystem, the man page for mount should be read in detail and the 'noexec' option examined.

Note that mounting the file systems using the '-a' option will mount all of the file systems that are normally mounted, with their usual read-write bits set. Using this option makes your root partition writable, and also mounts the rest of the partitions in your /etc/fstab that are normally mounted during boot-up.

2.1 Binary distribution

2.1.1 I want to install by NFS but I am having all kinds of problems connecting to the Sun server where the files are.

There is an unusual problem when installing over NFS. This problem may have been addressed in the documentation that comes with FreeBSD and NetBSD, but if not, here it is.

The most common problem seems to be that FreeBSD (and by inference NetBSD and all the other 4.4 based systems) does not send out NFS requests over privileged ports. Sun's NFS implementation (and others, once again by inference) expects precisely the opposite. These systems will quietly fail if you try to NFS to them. The usual error message (which may ONLY appear in /var/adm/messages) is "nfs_server: weak authentication, source IP address=xx.xx.xx.xx"

SunOS is particularly insidious at this point. The mount succeeds, but then everything else after that fails. This means that your FreeBSD or NetBSD system will return an EACCES error whenever you try to grab a file from the NFS filesystem.
The solution (tested in FreeBSD) is to include the 'resvport' flag like this: <pre> # mount -o resvport server:/fs /mnt_point </pre> or to use the -P flag (which does the same thing). See the mount and mount_nfs man pages for the details. In fact, the -P flag provides a solution to the FreeBSD NFS installation problem. When prompted for server/filesystem, type in the flag before the server/filesystem pair: <pre> -P server:/fs </pre> If you are using an 8-bit network card, and want to avoid the ring buffer overflow problems that seem to come standard with this class of cards, you can also include the "-r4096 -w4096" flags between the -P and the server. 2.2 Configuration By far, the most common configuration questions are partitioning, followed closely by all of the other software in the system. Sendmail and named are also problems occasionally, but the documentation that comes with them usually gets you through. If you run into a problem, post a question to comp.os.386bsd.questions. A less frequently asked question is "Where can I get info on how to configure a kernel?" The answer to this question has been provided by Richard Murphey (Email address rich@Rice.edu). -------------------------------------------------------------------- Ready-to-print PostScript files for each section of the net2 system maintainer's manual are on nova.cc.purdue.edu in pub/386bsd/submissions/bsd.manuals. smm.02.config.ps.Z describes kernel configuration for the VAX, however some of it is relevant to 386BSD. There is no freely available rewrite for 386BSD that I know of. -------------------------------------------------------------------- Most of these manuals are now included in the standard release of NetBSD and FreeBSD in the /usr/share/doc directories. 2.2.1 Partitions This section describes many of the questions that people ask about hard disk partitioning. The first is a brief explanation of the BSD system disk partitions. What is a 'disklabel' and why do I need one? 
The BSD partition table supplements the DOS partition table. The entries in this table are meaningful to BSD. There are eight partitions in the BSD partition table, and they are normally lettered from a: to h:. This supplemental partition table is often referred to as the 'disklabel'.

There have been many good articles in both the mailing lists and the newsgroups about disk labeling and partitioning. I have included a few of them here.

NOTE: This information has not really changed since 386BSD 0.1. Some of the specifics may be out of date (the use of the d: partition, for example) but the steps and information are still pertinent.

Phil Nelson (pail@cs.wu.edu) writes:

I have installed several disks that have > 1024 cylinders and have used both DOS and NetBSD. What has worked for me EVERY TIME is the following:

a) Tell the BIOS that you have 1023 cylinders and the correct geometry for heads and sectors. (This will limit your DOS part of the disk to be LESS than the first 1023 cylinders.) You need to have ALL of your partition A (/dev/wd?a) in the first 1023 cylinders so that the boot program can read the kernel from the root partition using the BIOS routines. (ed note: You can specify the full number of cylinders in some BIOSes and it won't make any difference. The DOS part of the disk will always be less than 1023 cylinders.)

b) With fdisk, partition your 1023 cylinders as you want them.

c) Use the real geometry in NetBSD. Once the NetBSD kernel is booted, it does not have the 1024 cylinder limit: that is only for the BIOS. NetBSD only looks at the BSD disklabel, not the DOS disk label. The two disk labels (DOS and BSD) may not agree on the BSD partition size! This isn't a problem, since each system's idea of the disk's geometry is based on different information.

d) Use NetBSD!

Chris Jones writes:

I was getting different reports of disk geometry from different programs, so I opened up the computer and read the plastic label on the drive.
I then instructed the BIOS (which, when using auto-detect, disagreed) what the disk geometry was. Then, I used pfdisk to create partitions. The first thing I did with it was to tell it what the geometry really was. It said something about a symbolic mapping and dealt with it. Then I was able to specify all partitions in real units instead of virtual ones. NetBSD boots fine, and if memory serves, it is the only program that has recognized the real disk geometry from the beginning.

This tutorial is provided by "Hacksaw" <hacksaw@user1.channel1.com> and provides an excellent overview of the hard disk partitioning procedure from start to finish.

"Disk Partitioning for the Compleat Idiot"

There are times, in our trials with our computers, that it becomes necessary to mess about with the disklabel. For those not knowledgeable of BSD or Unix Systems administration, this somewhat simple task can be somewhat daunting. This document is the result of my own short experience.

This does not cover physical installation of the disk. For those who are having trouble with that, I direct you to any of the fine manuals dealing with hard drives and your hardware. It also does not deal with the vagaries of the DOS partition manager. It assumes you have done that as well, if need be...

After the drive is physically installed and is recognized in the BSD startup, and it mentions both your drives, in the order you expect them... Or perhaps just the one, if you had special problems with installation. Now all you have to do is "disklabel" the drive... Well, what is *THAT*???

The disklabel is used by the kernel and other utilities to tell how you want or have the drive set up *logically*. In a beautiful world, we might have a very free hand at this set-up and expect it to work. Unfortunately, the authors of the software dealing with the hard drives either decided or were forced by circumstance to make certain things about the disklabel inviolate.
When you let the installation disk set the disklabel for your first drive it comes out like this:

<pre>
The a: partition is the primary partition.
The b: partition is the swap partition.
The c: partition is the amount of the disk used by 386bsd (swap and data)
The d: partition is the entire disk (on the PC version only).
</pre>

Of these, the only one that could be different is a:...

(Note for those of us who have spent far too much time using DOS: the labels a: b: c: d: e: f: g: h: DO NOT refer to DOS drives, but to partitions in your 386bsd partition... confusing, eh? For the sake of consistency I will never make a reference to DOS drives except by saying something like "DOS drive C:".)

It's possible to divide up the disk a bit differently, but three things MUST be:

c: must refer to every cylinder you wish 386bsd to use, either for your data or the swap space.

b: must always refer to a swap partition. Note that on any other than the first disk it does not have to, but if you enable swapping on that drive, and you are using b: for something else, that something else will be killed.

The reason for this is simple: It's hard coded in. "WHY?" you ask? (I did...) Probably time constraints, maybe tradition. But if you look at the code in "isofs" and "ufs" in your sys.386bsd directory, you will see numerous comments asking some of the same questions, which leads me to believe this may change in the future, making our lives both more complicated and easier at the same time...

Getting past the esoteric explanations, here is a method for figuring out and "labeling" your disk. We'll start with the disklabel from my second disk, in the form most understandable by humans... #'s signify the start of a comment.
<pre>
# /dev/rwd1d:
type: ESDI
disk: maxtor7245
label:
flags:
bytes/sector: 512
sectors/track: 31
tracks/cylinder: 16
sectors/cylinder: 496
cylinders: 967
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0

5 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   198400        0    4.2BSD      512  4096    16   # (Cyl.    0 - 399)
  b:    31744   447392      swap                        # (Cyl.  902 - 965)
  c:   479136        0    unused        0     0         # (Cyl.    0 - 965)
  d:   479136        0    unused        0     0         # (Cyl.    0 - 965)
  e:   248992   198400    4.2BSD      512  4096    16   # (Cyl.  400 - 901)
</pre>

Some math: Looking at the comments at the end and the size and offset columns, size is a function of (last - first + 1) * sectors per cylinder:

a: (399 - 0 + 1) * 496 = 400 * 496 = 198400
b: (965 - 902 + 1) * 496 = 64 * 496 = 31744
c: & d: (Since I have no DOS partition, whatsoever) (965 - 0 + 1) * 496 = 966 * 496 = 479136
e: (901 - 400 + 1) * 496 = 502 * 496 = 248992

248992 + 198400 + 31744 = 479136 (all the parts should equal the whole)

Some things I discovered (for all you in novice land like me...)

1. As you can see this disk has 967 cylinders, but I only refer to 966 of them, 0 - 965... This is because it's good practice to leave the "Landing Zone" cylinder out of it... This is usually the last cylinder, and it's where the read/write heads hang out when your disk is off...

Note from TSgt Dave: Most modern drive heads come to rest on a polished surface inside the highest cylinder. I could be mistaken, of course, and the Hard Drive Bible (or other appropriate reference manual) will tell the tale for each drive.

2. a: can be a regular partition, b: should be swap, c: everything 386bsd will get to use, including swap. d: is the entire disk from 0 - (cylinder_per_disk - 2) [leaving out the Landing Zone]

On the boot drive (The drive that actually contains the kernel), a: is the boot partition. On all other drives, it is a regular partition.
Regardless of whether you are using DOS or not, the entire a: partition must reside completely within the first 1024 cylinders. This is a limitation of the PC architecture. You can then use e - h for your other partitions. I am not sure whether you could specify b: as other than a swap partition and not run into trouble, but you could surely make it a zero sized one starting and stopping on the Landing Zone...

Note from TSgt Dave: This is a good idea. Another way to accomplish this is to simply not specify it in the map.

3. Stupid human trick: When doing the math don't forget that 400 - 900 refers to 501 cylinders (not 500). I did, for a while. No great problem I suspect, but why waste a cylinder...

4. newfs'ing really is that simple if you have the label right: "newfs /dev/rwd?x config_template" where the question mark is the physical disk, the x is a partition letter, and the config_template is the configuration from /etc/disktab for your disk drive.

* NOTE: This is a thumbnail sketch; read the man page to verify all of the options and be sure about how to proceed...

5. Then fsck the partition: fsck /dev/rwd?x Don't forget that fsck should be run on the RAW device.

6. As long as it checks out, you can then mount it and do disk things with it...

7. Add it to the fstab... (follow the man page). Don't forget that your new swap partition won't work if your kernel isn't configured for it, but it won't cause you any problem to have it there.

One last note from TSgt Dave: And I have yet to figure out a way to determine if it is or isn't using the swap partition anyway. There is a program called 'swapinfo' and it is part of the NetBSD source tree. On my system, it tells me that I never use the swap area. :)

A note for those trying to use the CCD: to figure out what the disk label should be for your concatenated device, assuming your disks are identical, just add up the cylinders (minus the ones you reserved for the individual disk labels).
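For those who would rather let the machine do the cylinder arithmetic from the notes above, here is a minimal Python sketch. The numbers come from the sample disklabel (31 sectors/track, 16 tracks/cylinder); the function names are made up for illustration, and reading the CCD rule of thumb as "usable cylinders per disk times the number of identical disks" is an assumption about what the note means, not something taken from the ccd documentation.

```python
# Values copied from the sample disklabel: 31 sectors/track, 16 tracks/cylinder.
SECTORS_PER_CYLINDER = 31 * 16  # 496


def partition_entry(first_cyl, last_cyl, spc=SECTORS_PER_CYLINDER):
    """Return (size, offset) in sectors for an inclusive cylinder range."""
    size = (last_cyl - first_cyl + 1) * spc
    offset = first_cyl * spc
    return size, offset


def inside_bios_limit(last_cyl, limit=1024):
    """True if a partition ending at last_cyl (0-based) stays in the first `limit` cylinders."""
    return last_cyl < limit


def ccd_cylinders(cyls_per_disk, reserved_per_disk, disks):
    """Assumed reading of the CCD note: identical disks, usable cylinders added up."""
    return (cyls_per_disk - reserved_per_disk) * disks


# Cylinder ranges from the comments in the sample label.
layout = {"a": (0, 399), "b": (902, 965), "c": (0, 965), "e": (400, 901)}
entries = {name: partition_entry(first, last) for name, (first, last) in layout.items()}

for name in sorted(entries):
    size, offset = entries[name]
    print(f"{name}: size={size:6d} offset={offset:6d}")

# All the parts (a, b, e) should equal the whole (c).
parts_total = sum(entries[p][0] for p in ("a", "b", "e"))
print("parts:", parts_total, "whole:", entries["c"][0])
print("a: inside BIOS limit:", inside_bios_limit(layout["a"][1]))

# Hypothetical example: two identical disks, 966 cylinders each, 1 reserved per disk.
print("ccd cylinders:", ccd_cylinders(966, 1, 2))
```

If the parts and the whole disagree, the usual culprit is the off-by-one from note 3: a range of first - last covers (last - first + 1) cylinders.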
I know this works for purely concatenated (not striped) IDE disks; I am assuming it should work on striped SCSI disks.

Commonly used definitions:

bsize: Block Size: This is the smallest allocatable area on a disk file system, sort of. A file uses as many whole blocks as it can, until the remaining data cannot completely fill up a block.

fsize: Fragment Size: This is the size of the 'leftover' data that didn't fit into a full block. For example, assuming an 8K Block Size/1K Fragment Size, a 34.5K file would use up four 8K blocks (4 * 8K = 32K) and three 1K fragments (3 * 1K = 3K). There are 512 bytes of wasted space, since 32K + 3K = 35K, which is 512 bytes larger than 34.5K. If you want to reduce the amount of wasted space, you can reduce your fragment size, but you also reduce the amount of data you read at one time, so your disk performance decreases also. A good setup is 8K/1K for performance, but if you are really concerned about wasted space you can consider using a 4K/512byte filesystem. For further information, find an article that explains the Berkeley FFS in more detail.

cpg: Cylinders Per Group: This determines the cylinder group size, which in turn determines the number and location of the alternate superblocks.

What other kinds of information do I need if I really want to tune my hard drive's performance in conjunction with a newfs?

Having taken Aim's suggestion and changed my newfs values, I think I've now made some empirical observations that suggest that the defaults for newfs should definitely be changed.

With all the disks I tested with, -n 1 (which isn't even *documented!*) provided greatly improved performance, as opposed to all other values of -n. I think that with sector-addressed drives with complex physical geometries, rotational position optimization is a technique which is no longer valid.
If _anyone_ has _any_ disk larger than 300MB or so (or even a small disk) manufactured within the last few years for which larger values of -n produce better performance than -n 1, I'm very curious to hear about it. I'd be particularly interested in any disk for which the default value produces optimal results. Increasing maxcontig seemed to always improve write scores, but values of maxcontig above 16 seemed to have a noticeable _negative_ impact on read performance. -a 512, for example, on the disk in my machine at home, yielded a peak write rate (4MB file, 8K record size) of 4.7MB/s, much better than the 4.3MB/s value for -a 64, but read performance was reduced from 2.6MB/s to 2.1MB/s. I do not understand why this is the case, and I'd love suggestions. I believe that with rotational position optimization turned off (-n 1), the value of the -r option is of no consequence. I believe that the fact that with the default value for -n, the -r option seemed to have little or no impact on performance serves to demonstrate that rotational optimization does not work correctly on modern drives. The default value of the -d option also produces much worse results than -d 0. I'm probably inexact up above; I believe that -n 1 -d 0 is what turns off rotational position optimization entirely. I'm all for it. :-) I suggest that the defaults for newfs be changed to: <pre> -n 1 -d 0 -a 16 -r 5400 </pre> The -r value just in case someone decides to try playing with rotational position optimization for some incomprehensible reason. Though actually, anyone with a disk where said optimization is a win might want -r 3600 after all. If someone can explain why values of -a above 16 seem to negatively impact read performance, I'm all for making -a very very large, like 512 or 1024 -- in this case the filesystem code will automatically limit maxcontig to the maximum transfer size for a given controller/disk, right? What are some typical such sizes? 
Why does -a 512 hurt read performance so much, and how can it be fixed?

From comments by Larry McVoy, a good implementation of UFS with clustering will yield disk speed on writes, and about 25% less on reads. Right now, on my hardware at least, we seem to _surpass_ slightly the speed of raw writes to the disk device on writes, but on reads we lose big as the maxcontig value goes up, and we seem to lose worst on large file/record sizes, where the raw device delivers about 5MB/s in my case, but with -a 512 I get only about 2.5MB/s under UFS. If you can't guess, I'm incredibly curious as to why the value of -a affects reads as much as it does, or at all, for that matter.

Still, we don't do so badly -- with -a 16, we pretty much hit Larry's "good" value on reads of 75% efficiency, and we still just barely surpass the raw device write figures. (I am very, very, very curious as to how this is possible at all. Anyone?)

2.2.2 Common Disk Label Problems.

Increasing the *BSD partition size.

There is no easy way to increase the swap partition without relabeling the drive. Unfortunately, relabeling usually involves reinstalling. That involves re-doing just about everything you have just finished doing. The good news is that if all you have done is the base installation, you don't have a lot of time and energy invested in the system. Take the time, and make sure that your swap space is at least as big as your memory; many people recommend even larger. There is no real limit to the size that this space can take. If you have two disk drives, you can have swap space on both. Simply follow the instructions above, and you will be all set.

If your swap space is smaller than your real memory, system core dumps will be disabled.

If you have compiled in the VNODEPAGER option in your config file, you can use vnode files for swap space.
The precise details are explained in the man pages, but the easiest way to start is to include the following line in your /etc/fstab:

    /dev/vnd0b swap swap sw

Defining the file for the vnode is fairly straightforward:

    vnconfig -c /tmp/swapfile /dev/vnd0b

and actually making it swap is as simple as

    swapon /dev/vnd0b

From there, the rest of your questions should be answerable from the vnconfig manpage.

I can access the DOS partition on my second disk from Unix but not DOS? Any suggestions?

One kinky problem that almost got me was when I tried to disklabel my second drive in order to use the DOS partition on it, and use the rest as swap for BSD (FreeBSD-1.0 Eps, SCSI drive on an AHA1542B, to be exact). The DOS partition was visible from UNIX, but *not* from DOS.

What I tried to do: Using PFDISK (from DOS), make one big DOS partition at the start and use the rest for a BSD partition (type 165). Something that came out like

<pre>
1   6  0 69 DOSbi
# ..
2 165 70 98 unkno
</pre>

for a 99 cyl drive. Using BSD disklabel, generate the disk description/label as documented in the FAQ. Make only 'c' (total BSD DOS part), 'd' (complete disk) and 'b' (intended swap) BSD partitions.

Problem: When writing the label, disklabel would ask about overwriting the DOS partition table. Whether I said y or n, the DOS partition table was screwed up, as seen from DOS (BSD saw the DOS file system very nicely indeed).

Cause, solution: BSD disklabel wants to write the label to the start of the 'a' partition; I had *not* defined an 'a' partition (since I was only using the disk for swap). This tells disklabel that the 'a' partition is the start of the disk, which means there is no DOS partition. Disklabel then writes the label at the start of the drive, which is why it talks about overwriting (aha!); this is *bad* for the DOS partition table. One solution is to have a non-empty (e.g. one cylinder) 'a' partition at the start of the BSD part of the disk, and resize the 'b' swap partition accordingly.
Now everything works just fine. Note that this solution can be used whenever you want the DOS partition table to be safe and the DOS partition to be mountable.

One other fly in this ointment. The disklabel program has historically asked "Overwrite disk with DOS-partition [n]: " and the normal inclination is to believe the prompt and press return for 'no'. The default answer may or may not be 'no'. There are several versions of disklabel where the default answer is actually 'yes' even though the prompt implies that you can press return and get 'no'. In this case, it might be best to assume that the default answer doesn't exist until you have had a chance to actually look at the disklabel code.

I want to use my entire 2 Gig drive as the root partition. Why doesn't it work?

The easiest answer is that the architecture of the machine has gotten you. Because of the limitations of the BIOS, everything the boot process needs must reside in the first 1023 cylinders on the disk. Most really big drives have more 'real' tracks than this, so DOS tries to translate the drive so it doesn't. The *BSD systems don't; they rely on the disk geometry being correct, or at least the same as the controller thinks it is. Once the system is up and running, the BIOS is disabled. This means that the system no longer has that 1023 track limitation.

What does this mean to you? Make sure that the root partition (the a: partition above) of your boot drive does not extend beyond track 1023. If you have a large DOS partition that covers nearly all of that, you may need to make a VERY small root partition to make absolutely certain the root does not extend past 1023.

2.5.3 How do I set up the system so that I can boot from more than one operating system/file-loader without using floppies?

There are many people that wish to be able to boot DOS or 386bsd at will. There are several programs that allow this. The program "os-bs" is one such program, "BOOTEASY" is another, and there are three or four others.
There are problems in some configurations using the os/2 boot manager for this, so beware.

In addition to being able to boot from either of two partitions, some people want to operate more than one disk drive (and perhaps boot from either as well). Christoph Robitschko provided one description of this.

Since there are virtually limitless possibilities for configurations for BSD systems, it will be impossible to answer all of the possible questions about these features. Many people operate with multiple disk drives on one or more controllers. Yu-Han Ting provides this tutorial on partitioning and booting multiple systems with a single hard disk.

2.2.3 How do I get the system to boot from the second hard drive?

Julian Elischer (julian@jules.dialix.oz.au) adds:

To make the boot code default to drive 1 look in /sys/{arch/}i386/bootboot.c for the following (or similar; the code may have changed a little and may be in a different directory):

<pre>
loadstart:
        /***************************************************************\
        * As a default set it to the first partition of the first      *
        * floppy or hard drive                                          *
        \***************************************************************/
        part = unit = 0;
        maj = (drive&0x80 ? 0 : 2);             /* a good first bet */
        name = names[currname++];
</pre>

and change it to:

<pre>
loadstart:
        /***************************************************************\
        * As a default set it to the first partition of the SECOND     *
        * floppy or hard drive                                          *
        \***************************************************************/
        part = 0;
        unit = (drive & 0x7F);
        maj = (drive&0x80 ? 0 : 2);             /* a good first bet */
        name = names[currname++];
</pre>

This way, whatever drive the boot blocks are loaded from, it has that as default. In my case, I get wd(0,a) when I have my netbsd drive as C:, and wd(1,a) when I have it as D:. (I've been swapping drives left right and center the last day getting dos to boot on one drive and netbsd on another).

2.2.4 How do I disklabel my second hard drive?
The obvious answer is to use 'disklabel -w -r /dev/rwd1d'. Unfortunately, this does not always put a real disklabel on the drive. The symptom is that the drive labels and can be used until the system is reset, at which point the system tries to read the label from the disk. It was never actually written to the disk, so the operation fails.

There are also reports that the /usr/mdec files are corrupted in some of the distributions. If you have tried everything else, you can either load the files from one of the many archive sites that keep the /usr/mdec files around, or you can recompile them yourself.

Instead of specifying the entire device path name (e.g. /dev/rsd0c), only specify the two letters of the device type and the unit number (e.g. "sd0"). Disklabel figures out the rest, and it works. For instance, the following line works for me:

disklabel -w -r sd0 <drive-type>

assuming of course that the boot block files are in /usr/mdec/ and the <drive-type> is in /etc/disktab.

This is also a symptom of some of the versions of FreeBSD and NetBSD where the disklabel code was 'fixed' to only write a disklabel on a drive that already has a disklabel. Oops.

Also, some folks want to mix SCSI and IDE drives together in the same system. Someone with an Austin Tower (486DX/50), AMI BIOS, Caviar 2250 IDE, Adaptec 1542CF, and Toshiba SCSI disk (1.2GB) posted this set of instructions:

The BIOS is configured to boot from the IDE drive as type 47 (user defined). The IDE drive currently has NetBSD 1.0 BETA on it. The 1542CF switches are 1-4 off (open), 5-8 on. The meaning is as follows:
<pre>
1(off)     = Termination software controlled.
2,3,4(off) = I/O Port x330.
5(on)      = disable floppy. I use the Austin floppy controller.
6,7,8(on)  = disable Adaptec BIOS.
</pre>
Note that this means the Adaptec 1542CF on-board setup program is also disabled.
If I need to change my SCSI termination, I first have to enable the Adaptec BIOS (sw 6,7,8), enter 1542CF setup and change termination, then change the switches again. I could not configure the system to boot from the SCSI drive having the IDE as a secondary drive.

(Ed Note: There is more news on this front all of the time. Since I personally don't have much interest in doing this (I boot from my IDE drives and mount my SCSI drives) I don't see the problem.)

2.2.5 NetBSD and FreeBSD cannot handle disk geometry translations, but it turns out that my disk geometry is translated. It has five zones, each with a different sec/track! What kind of things can I do about the disk translation my hard disk controller uses?

It turns out that what *BSD cannot handle is not translation, but translation that changes during the boot-up process. For example, the configuration above will work just fine IF the translation that the controller uses when it powers up is the same one that it uses when it boots. On many PC clones, the BIOS loads a different geometry after it boots to make the geometry agree with the one that is loaded in CMOS. This is the fatal flaw for *BSD. Fortunately, once the problem has been identified, it is relatively easy to handle. Simply make sure that the BIOS is configured to set the controller to the translated geometry that the card powers up with.

There are several ways to get around these problems with disk geometry translation. If you are using a SCSI controller, you can specify the geometry such that each 'cylinder' is 1 Meg (64 sectors per track by 32 tracks per cylinder, for example). Most SCSI controllers will blithely ignore what YOU tell them the geometry is and press on using this type of 1 Meg cylinder to get the job done. NOTE: If you are going to try this, try to ensure that each 'pseudo cylinder' is a reasonable size (like 1Meg or 512K).
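The '1 Meg cylinder' figure is plain arithmetic, and it can be handy to check a candidate geometry before typing it into a disklabel. The following is only a sketch: the 512-byte sector size is an assumption (it is the usual value, but check your drive), and the function names are made up for this example.

```python
# Sketch: size of one "pseudo cylinder" for a made-up SCSI geometry.
# SECTOR_BYTES is an assumption (512 is typical); the helper names are
# hypothetical and do not come from any BSD tool.
SECTOR_BYTES = 512


def cylinder_bytes(heads, sectors_per_track, sector_bytes=SECTOR_BYTES):
    """Bytes in one cylinder: tracks per cylinder * sectors per track * sector size."""
    return heads * sectors_per_track * sector_bytes


def whole_cylinders(capacity_bytes, heads, sectors_per_track):
    """Number of complete cylinders that fit in the given capacity."""
    return capacity_bytes // cylinder_bytes(heads, sectors_per_track)


if __name__ == "__main__":
    # 64 sectors per track by 32 tracks per cylinder, as in the text above.
    print(cylinder_bytes(heads=32, sectors_per_track=64))   # 1 Meg cylinder
    # Halving the tracks per cylinder gives the 512K variant.
    print(cylinder_bytes(heads=16, sectors_per_track=64))
    # A 1.2GB disk (taken here as 1200 * 2**20 bytes) in 1 Meg cylinders.
    print(whole_cylinders(1200 * 2**20, heads=32, sectors_per_track=64))
```

With 1 Meg cylinders the cylinder count is simply the disk size in megabytes, which makes the numbers easy to sanity-check by eye.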
An interesting method for dealing with disk geometry comes from Alan Barrett (barrett@lucy.ee.und.ac.za):

This sort of problem happens when you try to install NetBSD in a partition of a disk whose controller does geometry translation. I have not had time to find the bug that causes the problem.

One option is to disable the geometry translation: Use ide_conf to find the true geometry, use the CMOS setup program to tell your BIOS about the true geometry, and reformat everything. I successfully did that on one of my systems.

If you are not able to, or do not wish to, disable the geometry translation then the following work-around might work for you. This requires that the disk have unused space on {cylinder 0, head 0}, from sector 2 to sector 16. Almost all DOS disks that I have ever seen satisfy this condition, because they usually start the DOS partition in {cylinder 0, head 1, sector 1}, leaving most of {cylinder 0, head 0} unused apart from the partition sector in {cylinder 0, head 0, sector 1}. However, many partitioning programs like to hide this fact from you, and pretend that the DOS partition starts at the front of the disk; don't believe them until you have checked with a raw disk editor.

0. Make sure you have adequate backups.

1. Use a partition sector editor (fdisk, pfdisk, os-bs, booteasy, Norton utilities, whatever) to mark the partition that you want for NetBSD as bootable with type 0xA5 (decimal 165).

2. Halt the system. Boot the NetBSD kernel copy floppy. When it asks you to insert the floppy for the root file system, switch to the Install-1 floppy and press enter.

3. Answer all the installation prompts, using numbers based on the translated geometry. When it asks if you really want to label the disk, be brave and say yes.

4. Halt the system. Boot to DOS. Run a disk editor program, such as Norton utilities.

5.1. Verify that the partition sector in {cyl 0, head 0, sec 1} is undamaged. Verify that the disklabel program run as part of the NetBSD install has written the NetBSD primary boot block to {cylinder xx, head 0, sector 1}, written the disk label to {cyl xx, head 0, sec 2}, and written the secondary boot program to {cyl xx, head 0, sectors 3 to 16}. ("xx" represents the translated cylinder number you chose for the start of the NetBSD partition. You did choose to start on a cylinder boundary, I hope.)

5.2. Verify that the space in {cyl 0, head 0, sectors 2 to 16} is still available. Copy the fifteen sectors containing the NetBSD disk label and secondary boot block from {cyl xx, head 0, sectors 2 to 16} to {cyl 0, head 0, sectors 2 to 16}.

5.3. Edit the partition table in {cyl 0, head 0, sec 1}. Change the system ID of the NetBSD partition from 0xA5 (decimal 165) to something else (I use 0xA4, decimal 164), but keep it flagged as bootable. This will let you boot to the NetBSD primary boot block.

5.4. Edit one of the previously unused partition table entries (I hope you have one), to contain the following information: {sys id = 0xA5, boot flag = 0, start cylinder/head/sector = 0/0/1, end cylinder/head/sector = anything, initial offset = 0, total size = anything}. This will tell the NetBSD primary boot block, or a NetBSD system booted from a floppy, that it should look for the NetBSD disk label in {cyl 0, head 0, sec 2}.

6. Halt the system. Boot the NetBSD kernel copy floppy. When it asks you to insert the floppy for the root file system, just press enter without changing disks.

7. Copy the kernel, and proceed with the rest of the installation as per the instructions provided with NetBSD. It should now work because of the trickery with the partition table etc.

2.2.6 I am having trouble installing on the EIDE hard drive. What are some of the things that I need to look into?

Bradley W Mazurek (bwm260@skorpio3.usask.ca) writes:

First, I had to change the IDE translation mode in my BIOS. Rather than using LBA, I used Standard CHS.
When I went in to repartition the disk for DOS, DOS reported that the drive was only 523 MB (1023cyl, 64h, 63sec/tr), rather than the true geometry (2100cyl, 64h, 63sec/tr), but I didn't worry about it.

Next I created my DOS partition. I partitioned the disk so that cylinders 1-999 were DOS. That left cylinders 1000-1023 for NetBSD. Lots of room! :) Anyway, on a hunch, a friend and I were hoping NetBSD didn't look at the ending cylinder entry (1023) of the partition table.

Next I calculated the length of the partition from 1000-2100 and put this into the partition table using the disk editor. The numbers weren't consistent in the partition table, but DOS ignored the non-DOS partition, NetBSD was happy...and we've (DOS, NetBSD and my remaining hair) all lived happily ever after....

[Ed.Note. The partition table needs to correctly identify the NetBSD portion of the disk, regardless of whether or not DOS can handle it. See the section on hard drive partitioning for more information...]

My suggestion is to try to find an IDE translation mode in your BIOS for which the number of heads and number of sectors per track is consistent with the true geometry of your hard drive. Then perhaps this trick will work.

1. There is _different_ behavior if one executes
<pre>
disklabel wd0
or
disklabel /dev/wd0c
or
disklabel /dev/wd0d
</pre>
It never became quite clear to me what these differences are exactly.

2. Any disklabel write will change not only the data on disk, but also some data structures in core. For example, if one tries to write a completely different disklabel to a completely different place, say /dev/wd0h, there will be strangeness afterwards. That means that writing a disklabel and then reading it back does not necessarily mean that the write succeeded. There is an option -r to disklabel which is said to access the disk directly, but, as I noticed, the in-core data is updated thereby, too.
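Returning to the partition-length calculation in Bradley Mazurek's account above: the offset and size fields of a partition table entry are counted in sectors, so the cylinder numbers have to be multiplied out by the geometry. The sketch below uses the figures quoted in that account (64 heads, 63 sectors per track, 2100 cylinders). It assumes that cylinders are numbered from 0, that 2100 is the total cylinder count, and that the partition starts at head 0, sector 1 of cylinder 1000; the function names are hypothetical.

```python
# Sketch: turn a cylinder range into the sector offset and sector count
# that go into a partition table entry. Assumptions: cylinders are
# numbered from 0, sectors from 1, and the partition begins on a
# cylinder boundary. The names here are illustrative only.

def chs_to_sector(cyl, head, sector, heads, sectors_per_track):
    """Absolute sector number (0-based) of a cylinder/head/sector address."""
    return (cyl * heads + head) * sectors_per_track + (sector - 1)


def partition_extent(first_cyl, total_cyls, heads, sectors_per_track):
    """Return (start, length) in sectors for first_cyl up to the end of the disk."""
    start = chs_to_sector(first_cyl, 0, 1, heads, sectors_per_track)
    length = (total_cyls - first_cyl) * heads * sectors_per_track
    return start, length


if __name__ == "__main__":
    start, length = partition_extent(first_cyl=1000, total_cyls=2100,
                                     heads=64, sectors_per_track=63)
    print("initial offset:", start)   # sectors before the partition
    print("total size:   ", length)   # sectors in the partition
```

The ending cylinder field in the table cannot hold a value above 1023, which is why the length has to be entered by hand as described in the account.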
The following paper explained to me what should happen in sequence on boot: /usr/src/sys/arch/i386/boot/README.386BSD. It says (in short):

[...]

1/ The BIOS loads the first block of the disk (called the Master Boot Record or MBR) and, if it has the correct magic numbers, jumps into it.

2/ The MBR code looks at the partition table that is embedded within it to determine which is the partition to boot from. If you are using the os-bs bootblocks (highly recommended) then it will give you a menu to choose from.

3/ The MBR will load the first record of the selected partition and, if it has (the same) magic numbers, jumps into it. In 386bsd this is the first stage boot (or boot1); it is represented in /usr/mdec by wdboot, asboot and sdboot. If the disk has been set up without DOS partitioning then this block will be at block zero, and will have been loaded directly by the BIOS.

4/ Boot1 will look at block0 (which might be itself if there are no DOS partitions) and will find the 386bsd partition, and using the information regarding the start position of that partition, will load the next 13 sectors or so to around 0x60000 (640k - 256k), and will jump into it at the appropriate entry point. Since boot1 and boot2 were compiled together as one file and then split later, boot1 knows the exact position within boot2 of the entry point. Boot1 also contains a compiled-in DOS partition table (in case it is at block 0), which contains a 386bsd partition starting at 0. This ensures that the same code can work whether or not boot1 is at block 0.

[...]

2.2.7 My disk label is complaining about '256 heads' in the disklabel. This is obviously bogus, but it doesn't seem to be hurting anything. Is it Okay or should I fix it?

Steve Gilbert (gilbert@cs.utk.edu) provided us with this answer:

First, if you do a "fdisk wd1" (it may be wd1d, I don't remember what it wanted), it will list out the partition table for you.
This is something totally different from BSD's idea of a partition, mind you. The last partition (#3) should be BSD. All of those figures are correct except for the "ending head" field which is set to 255 (thus, 256 heads).

1. BACK UP EVERYTHING!

2. fdisk -u wd1 ...this will prompt you for the stuff you want to change. Remember, everything is correct except for the ending head. Accept all the default values it gives you at first. You'll have to tell it that you want to explicitly define the beginning and ending values.

3. My 420 MB Conner drive has 16 heads, so I just enter 15 as the ending head number.

4. When you are back out of fdisk, you can do another fdisk wd1 to make sure the values are correct. Don't worry if you mess up, you can always change it again. Anything you didn't back up is probably gone by now anyway :-)

5. Reboot and watch NO error message pop up!

...remember that all you want to do is fdisk the drive. You do NOT want to run disklabel again or newfs the partitions again. This will write the incorrect 256 crap back. I did this three times before I finally got smart and did it right.

2.2.8 What are the options for the boot up prompt?

The options are supposed to be as follows:
<pre>
-s............... boot into single user mode
-a............... ask the user what device to use as root just before
                  mounting it (Not presently supported)
-d............... once you have the kernel loaded and VM and such up
                  and going, drop into the kernel debugger. (great for
                  debugging probe code)
</pre>
A related question concerns the options on the 'reboot' program.
These flags are as follows:
<pre>
-a    Ask for a file name to reboot from
-s    Reboot into single user mode
-b    Don't reboot, just halt
-r    Use compiled in Root device
-c    Invoke the user configuration routines
-d    Transfer control to the kernel debugger, if available
-v    Print out all potentially important information
</pre>
As with so many other things in the systems, each of these may (or may not) work for FreeBSD or NetBSD. Your Mileage May Vary.

One other note about 'reboot'. There are some motherboards which do not reboot reliably. Instead of rebooting, they simply hang. While this isn't a definitive answer, some folks have noticed that having the BIOS relocate option set seems to help them, especially with Micronics motherboards. If you are having problems with your system not resetting after a reboot, try changing the setting on the BIOS relocation option.

2.2.9 I am having trouble installing WRT 'syslogd: bind: Can't assign requested address' errors. What are some of the things I should look at? I also am having trouble with the network: 'starting network ... ifconfig: localhost: bad value'.

This is caused by incorrect settings in /etc/netstart and/or /etc/hosts.

In /etc/hosts, you must have a line that says:
<pre>
127.0.0.1 localhost localhost.{yourdomainhere}
</pre>
Errors that will result if you don't do this: ifconfig will not be able to figure out what IP address goes with the name 'localhost' and you'll get 'localhost: bad value.'

In /etc/netstart, you must do:
<pre>
ifconfig lo0 localhost
route add {hostname} localhost
</pre>
Errors that will result if you don't do this: the loopback device will not be properly configured and/or you will have no route to it. The result is that programs expecting to have networking enabled (including syslog and friends) will get horribly confused.

*AND*, if you're not going to be directly connected to a network, you should change /etc/host.conf to say:
<pre>
hosts
bind
</pre>
It's set up the other way around by default.
I don't like it that way myself. Errors that can result if you don't do this: if you don't have a nameserver available to you, the resolver will have trouble translating hostnames into IP addresses. Bogosity levels will be off the scale. (Note also that if you do have access to a nameserver, you need to set up /etc/resolv.conf to point to it.) By changing the order, you'll be telling the resolver to check the host files for matches *first*, then roll over to the nameserver (if any) if no match is found.

Make sure that:

- There are no typos in any of the three files mentioned above.
- There are no bogus non-ASCII characters in the files mentioned above.
- All three files have their read permission bits set.

Lastly, be very careful with /etc/hosts.equiv. If you add a hostname to it, say 'otherhost.domain,' then root on otherhost.domain will be able to rsh/rlogin to your machine without a password.

Once you have everything set correctly, you should be able to type 'telnet localhost' and establish a connection to yourself. If you get an error such as 'localhost: unknown host' or 'network unreachable' then you still have work to do.

2.2.10 When I start up my system, it hangs for three or four minutes during the 'netstart' program. Our network nameserver is working OK, and I use it all the time; my resolv.conf file says to use the network nameserver. Why would netstart have such problems using it?

When the system is starting, the nameserver hasn't started yet. If you are using any names that must be resolved, the system will attempt to get the names from the nameserver. When that fails (three timeouts at one minute apiece) the name will be resolved from the /etc/hosts file (if the name is available).

There are essentially two ways to solve the immediate problem. The first is to reduce the number of entries you have in your /etc/hosts file to the absolute minimum you need for booting and change the order for host resolution from 'bind file' to 'file bind'.
If you specify a name in any of your start up files that is not in /etc/hosts and the name server isn't available, you will still have the hang, but this is only a small annoyance.

The second (and generally more effective) way to deal with the problem is to use only numeric addresses in your /etc/* files. This way, the resolver is never called upon to figure out the addresses and your boot-up will always 'just work'. This is sometimes more time intensive to set up, since all of the names in the files need to be removed and replaced with numbers. "C'est la vie".

2.2.11 I am having trouble getting net aliases to work. What could some of the problems be?

There are many things which will cause network aliases to not work right. Here are a few:

- Use "netmask 0xffffff00" (or whatever is appropriate) for the first IP address, and "netmask 0xffffffff" for all aliases that happen to be in the same (sub)net as the primary one. The reason this is right (no matter how odd it may seem) is that you have multiple interface addresses referring to the same network. You *have* to choose one of the various interface addresses as the "gateway" for outgoing packets into this network; you cannot have them going out through a dozen addresses simultaneously. The netmask 0xffffffff prevents the kernel from considering this IP address as a valid gateway (since it's not pointing to any network at all).

The correct syntax in /etc/rc.local for declaring a net address alias (assuming you are updating the eth0 interface) is:
<pre>
ifconfig eth0 xx.xx.xx.xx netmask alias
route add -host xx.xx.xx.xx localhost
arp -s eth0 yy.yy.yy.yy.yy.yy proxy
</pre>
Where xx.xx.xx.xx is the host address for the alias and yy.yy.yy.yy.yy.yy is the interface MAC address (if appropriate).

2.2.12 I'm having trouble with the networking code (specifically the PPP stuff to my ISP). How can I debug NetBSD's networking?

Bring the PPP connection up again. Retry whatever-it-is that's failing. PPP includes a link-level checksum.
Watch the packet counts in the netstat -I ppp? output over time. Check carefully to see whether the PPP driver is recording input errors (frames whose CRC failed). Frames with bad checksums are counted in the Ierr field. A non-zero count indicates _something_, possibly overruns, is in fact garbling your PPP traffic. If the packets are being discarded due to errors at the PPP level, they'll never even get as far as IP.

Also, use netstat (or an SNMP daemon and monitor, if you prefer) to watch the rate of change of bad packets at the IP and TCP level. I run "netstat -p ip" and "netstat -p tcp". One has to manually compute the rate of change; netstat's -i option means something different to, say, vmstat's. (Adding periodic sampling and rate-of-change to netstat would be a Cool Project.)

At the IP level, the relevant statistics are
<pre>
0 bad header checksums
0 with size smaller than minimum
0 with data size < data length
0 with header length < data size
0 with data length < header length
0 with bad options
0 with incorrect version number
[...]
0 output packets dropped due to no bufs, etc.
</pre>
At the TCP level, look for, e.g.,
<pre>
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
</pre>
Any of these being non-zero would support the hypothesis of a bug in the PPP implementation. Unlikely, but one never knows.

It could be that a TCP ack got munged or dropped by your PPP link; or possibly somewhere else in the Internet. That's not abnormal on busy links. What OS is your FTP peer running? Is it a pre-2.0.0 Linux or an older version of a commercial Unix? If so, have you tried turning off rfc1323 on your NetBSD machine?

2.2.13 I want to hard wire my SCSI devices to a particular device number. Is that possible?

You can do the numbering any way you please. Say I had two controllers. You could number them as:
<pre>
sd10 at scsibus0 target 0 lun ?
sd11 at scsibus0 target 1 lun ?
[...]
sd20 at scsibus1 target 0 lun ?
sd21 at scsibus1 target 1 lun ?
[...]
</pre>
Of course, you will need to add devices to the /dev/ directory for each of them, pointing to their correct major and minor numbers.

You can also hardwire the 'scsibus' numbers, by doing something like the following (assuming "whatever" is the SCSI host adapter driver's name 8-):

whatever0 at whateverbus? [whateverbus config info]
scsibus0 at whatever0

then
<pre>
sd0 at scsibus0 target 0 lun 0
</pre>
etc. That syntax won't work on ports which use 'old config,' but I believe an appropriate description of how to do it on them has already been posted.

The most common configuration for locked down drive numbers is actually:
<pre>
sd0 at scsibus0 target 0 lun 0
sd1 at scsibus0 target 1 lun 0
sd* at scsibus? target ? lun ?    # SCSI disk drives
</pre>
You can do the same thing with your tapes, CDs, and other SCSI devices as well.
<pre>
st0 at scsibus0 target 6 lun 0
st* at scsibus? target ? lun ?    # SCSI tape drives
cd0 at scsibus? target 5 lun 0
cd* at scsibus? target ? lun ?    # SCSI CD-ROM drives
etc.
</pre>

2.3 Common installation problems.

There are many common installation problems. This section covers the most basic and common of these problems. In addition to this section, you should also read through the other sections of the FAQ, since many of the less common questions are answered in other places in the doc.

2.3.2 Endless reboot cycles.

Another incarnation of this symptom is that the disk geometry on your disk label (as installed by install) is different than the geometry your hard drive controller thinks it is using. This will most often manifest itself on controllers that insist on operating in some type of translation mode. Normally the fix is to find out what the controller geometry is and make the disk label agree. There are programs available to help with this problem.

2.4 The computer just sits there, or 'that isn't right'.

This class of problems is sometimes caused by an incorrect FTP of the boot disk.
Make sure that the files were grabbed in 'binary' mode and that the size reported back is 1244000 bytes. Use the Unix program 'dd' or the DOS program RAWRITE to put these files onto the diskette.

In addition, this is the 'miscellaneous' section of the FAQ. These problems are included here because they are usually preceded by 'I just finished installing...'

Another incarnation of this problem is that, sometimes, the major or minor device numbers for a particular device may not get made correctly in the install (or upgrade) procedure. If you have a problem where you can install and everything seems to go well until you try to boot onto the hard drive, try running the MAKEDEV script that resides in /dev. Read through the file with 'more' to see what kind of options you should include (if the sd0a drive needs to be fixed, for example, the command './MAKEDEV sd0' should get your devices back on the road).

If that doesn't work, try one of the many things below. It could be any (or all) of them....

2.4.1 The boot disk works all right on one computer but not another.

This could be a problem with many different pieces, some of which are:

- Misconfigured hardware. The iomem, IRQ, and other board settings must match the ones listed in the INSTALL.NOTES. Unfortunately, the INSTALL.NOTES are on the disk that will not boot. You can grab them via FTP from many archive sites. This installation file may have many names. Look for something kind of obvious (like 'install.notes' or 'readme' or 'configuration guide') and you should find it.

Finally, there have been many reports (particularly with the BusLogic SCSI cards; specifically reported was the BT445C VLB host adapter) where the system seems to boot up, but starts getting 'stray interrupt c' messages. This is usually caused by people having their SCSI card set up on some IRQ other than the one that the kernel expects. The factory default for this card seems to be IRQ 12, but the kernel wants the card at IRQ 11.
Changing the card's setting (using the configuration program supplied) so that it matches the kernel makes the card work.

- Unsupported hardware. There are several SCSI controllers on the market that are not fully supported by 386bsd. This is due in large part to the way these controllers work. Instead of using a standard interface and command set for the controllers, most manufacturers make up their own controller interface language, which is then translated into SCSI commands which are interpreted by the drives.

- One or more of the devices in the /dev directory on the intended root partition was either not created correctly or was not created at all. Run the program MAKEDEV in the /dev/ directory to ensure that all of the correct devices are built.

2.4.2 Really strange errors in the various *BSD flavors.

Using the new code in NetBSD, I get a "panic: pdti 206067" in pmap_enter(). What should I do?

Increase NKPDE in /sys/i386/include/pmap.h. The largest it should be is 31; see question for other useful values. Be sure to keep the changes around as a patch file, since this is one of the files that will get overwritten during a source code update. Note that in the versions of NetBSD newer than 1.2.1, this value is computed, so you won't need to change it.

2.4.3a I get the error "isr 15 and error: isr 17" on an NE2000 card.
2.4.3b I have some card on IRQ2 and it doesn't work; why?
2.4.3c I am getting lousy performance out of my network card. What are some of the other possibilities?

The description of this problem is that one of the cards in your system (most likely the VGA card) is either generating interrupts or is causing IRQ 2 to be actively disabled. Older VGA cards used IRQ 2 during vertical retrace to prevent sparklies.
One solution would be to plan on not using your Ethernet card (or any other card you want on IRQ 2) until you have rebuilt the kernel so that it expects it at an interrupt other than IRQ 2 or 9, re-jumpered or reconfigured the card to match the IRQ you have selected, and enabled it.

From time to time, this problem will manifest itself as a general tendency of the network card to transfer either very sporadically or very slowly. It is precisely the same problem.

James Van Artsdalen (james@bigtex.cactus.org) has offered at least one solution:

If this is the problem, you can use Scotch tape to cover the IRQ 2 signal on the VGA's ISA connector.

There has been some discussion as to whether Scotch tape is really appropriate inside a card slot. My answer would be "yes". This is because the alternate solution of cutting the trace on the video board seems, to my mind, to reduce the value of the board. It is possible that, in the future, with a bi-partite driver, you would want to catch the retrace interrupt to get rid of "sparklies" or to implement a driver for a very high resolution monitor for X. If this happens, given a choice between alcohol and solder, I vote for alcohol.

One other thing to look for (if your video card seems to be the culprit) is a jumper which enables or disables the card's IRQ 2. Newer cards may have a jumper or switch which does this, so take the time and look for it before you get the razor blade out.

Either way, you will probably find that your VGA card uses IRQ 2 strictly for compatibility with older cards. With the advent of dual-ported memory for video cards, virtually all of these types of problems have disappeared.

2.4.4 What is the difference between IRQ2 and IRQ9? Are they really the same, or are they really different?

On the XT, there was one interrupt controller, an Intel 8259, which handled 8 interrupts numbered IRQ0 through IRQ7. IRQs 2 through 7 were accessible via bus lines IRQ2 through IRQ7. The AT had two interrupt controllers.
Due to the design of the 8259, one has to be the master and the rest (up to 8) must be slaves. Each slave controller outputs its interrupt request to an input on the master controller. In the case of the AT, the master controller handles IRQ0 through IRQ7. The slave handles IRQ8 through IRQ15. The interrupt request from the slave to the master goes through IRQ2, which is termed the cascade input. IRQ2 was chosen to allow future compatibility with the old XT hardware; it was the first IRQ that was 'available'.

This means, of course, that the bus line for IRQ2 could no longer be used for external interrupts. Instead, the bus line that WAS IRQ2 in the XT became IRQ9 on the AT. This whole issue is confused further by the fact that some vendors refer to this external interrupt as IRQ2, while others refer to it as IRQ9. In either case, if you are talking about an external interrupt, it means the same thing.

BTW, IRQ8 is used for the Real Time Clock, and does not have an external interrupt.

Here is a map, in case anyone still needs it:
<pre>
Internal   External        Function
IRQ0       n/a             Refresh/Timer
IRQ1       n/a             Keyboard
IRQ2       n/a (AT only)   Cascade Input to Master
IRQ3       IRQ3            Free (Com port)
IRQ4       IRQ4            Free (Com port)
IRQ5       IRQ5            Free
IRQ6       IRQ6            Floppy Controller
IRQ7       IRQ7            Free (Printer/Sound Card*)
IRQ8       n/a             Real Time Clock
IRQ9       IRQ2            Free (Network card)
IRQ10      IRQ10           Free
</pre>
etc.

* NOTE: The IRQ7 entry is spooky. If you use the interruptless printer driver (either from 386bsd, NetBSD, or FreeBSD) then you can still have an interrupting device (like a sound card) on interrupt 7. Basically, you can have as many devices on each IRQ as you want, but only one of them can be 'actively' interrupting.

2.4.5 Some of my SCSI devices (like a tape drive) don't work; why?

Even with the newer systems, you run the risk of having a problem with a SCSI device from time to time.
There are some cards (like the new Adaptec 27* series) for which either software drivers are not in the works or the documentation is simply unavailable.

Another culprit here is that some machines are very touchy about the quality and length of cables, as well as SCSI IDs. There was one report of an older hard drive that took a little longer to spin up than the rest of the drives in the chain. Whenever this drive was put early in the ID string (like 1 or 2) it would be 'not found', but if it was placed near the end (like after the tape drive) it would have spun up and been found.

2.4.6 I want to use the Adaptec 1542C SCSI controller. What are the problems/tricks you need to know to get it working?

The first thing to check when trying to use the 1542C is the setting of 'Enable Disconnection' under the 'SCSI Device Configuration' menu. It should be set to YES for all devices, as the manual warns you.

Matthias Urlichs (urlichs@smurf.ira.uka.de) has provided this description of the types of things that can cause problems for the controller and devices attached to it.

The problem is that the Adaptec 1542C has (a) rather powerful line drivers, and (b) is sensitive to transient signals which can be induced by them via either a bad cable or a bad external terminator. A bad cable is almost any cable which doesn't meet SCSI-2 specs. A bad external terminator is one which doesn't adequately buffer its resistor network. So...

- Remove the internal terminator from the last drive in your chain. Replace with an active SCSI-2 external terminator. Side improvement: active terminators consume a bit less power.

- Check cables. Specifically, some cables carry fewer than the nominal 50 signal wires. Manufacturers sometimes think they can get away with this because almost all odd-numbered pins are GROUND anyway. So, if pins 1 and 3 or 3 and 5 are connected, you're likely to have a marginal cable.
- Make sure that the terminator power is supplied by all devices and that the power pin is actually connected on your cable. The problem here is that some idiot device manufacturers save on 2-cent diodes, which means that the thing will pull terminator power to ground if it's not plugged in. (Two of these on one bus are even worse.)
- Consider creating your own cabling. Take a 50-wire flat ribbon and press the appropriate connectors onto it in precisely the right places. (Move your devices so as to minimize cable length.) Be aware that if a device has two external connectors, you must take the SCSI bus in at one connector and out at the other -- don't leave the other connector dangling; this isn't within the SCSI specs because the cable usually is too long.
- Better but more expensive: use 2-twisted cable. (I.e., wires 1&2 are twisted around each other, wires 3&4, ...) This will improve reliability because the wires are twisted at different rates. These cables have short non-twisted segments every 50 cm (1.5') so that you can press on your connectors instead of heating up that soldering iron.
- While you're rebuilding your system anyway...: If you have more than one drive per power supply, check whether these drives have adequate condensers to buffer their power. I have two 80-MB Seagates which refused to work more than a few hours without glitches -- then I soldered two 10-uF tantalum capacitors onto their power connector and they've been flawless ever since.

The terminator power is pin 26. Be aware that SCSI counts pins as they appear on a ribbon cable, not as they're sometimes numbered on the connectors. Pin 25 is supposed to be disconnected.

2.4.7 Is there a SCSI utility which works to fix up the random problems I sometimes have with my drives?

That depends on the problem. One of the first things you can try is Ian Dell's (Ian.Dell@dsto.defence.gov.au) SCSI Disk Doctor (sdd) package. There are NetBSD i386 and Sparc executables on ftp://ftp.mono.org/pub/sdd.
FreeBSD uses a couple of utilities which come with the system (scsi and scsireprobe) to accomplish some of the same operations. Try one of those (obviously based on your system type) and see if it fixes your problem. If not, then the prospects are pretty grim for your drive.

2.4.8 My system boots OK off the floppy, but once I try to boot from the hard drive, the message "changing root device to sd0a" appears and the system hangs. What is the most likely thing that I have done wrong?

A common cause is that not all of the required device nodes were created on the root partition. Since you say you can boot off of a floppy, do so and check to make sure everything in /dev exists. You might consider running "MAKEDEV all" to be sure everything is created.

(Ed.Note: I find that whenever I create a new kernel, it isn't a 'bad' idea to run a precautionary MAKEDEV to make sure that the devices are created correctly. Since I only build a new kernel about once a month, it isn't a very costly insurance policy.)

Also, there are known problems with IRQ configurations and the PCI bus. The system hanging right after the "changing root device" message usually indicates a misconfigured IRQ for the controller. The initial probes by most (all?) drivers are done in polled mode; only when mounting the disk for real does the kernel begin to do interrupt-driven I/O and DMA.

Is this system a PCI system? Is the SCSI controller a PCI board? If so, make sure the IRQ configured in the PCI BIOS matches the IRQ configured for the card. Also, with PCI, forgetting to enable the slot for "master" in the BIOS setup or motherboard jumpers, or putting a bus mastering card in a slave-only slot, will give similar symptoms.

The system may not have problems under DOS because some SCSI BIOSes or device drivers don't actually use the DMA or bus mastering features of the card... {sigh}, they run in PIO mode under DOS.
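
As a rough illustration of the /dev check suggested in 2.4.8, here is a minimal Python sketch that reports which expected entries are missing or are not device special files. The function name and the list of node names are hypothetical, chosen only for this example; the authoritative list of nodes for your system is whatever your MAKEDEV script creates, and this sketch does not verify major or minor numbers.

```python
import os
import stat


def missing_device_nodes(dev_dir, names):
    """Return the entries in names that are absent from dev_dir or that
    exist but are not block or character special files."""
    missing = []
    for name in names:
        path = os.path.join(dev_dir, name)
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # The entry does not exist (or cannot be examined).
            missing.append(name)
            continue
        if not (stat.S_ISBLK(mode) or stat.S_ISCHR(mode)):
            # Something is there, but it is not a device node.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list for illustration only; adjust it for your system.
    expected = ["sd0a", "rsd0a", "console", "null"]
    for name in missing_device_nodes("/dev", expected):
        print("missing or not a device node: /dev/" + name)
```

If anything is reported, re-running MAKEDEV from the floppy-booted system, as described above, is the usual fix.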
2.5 Other common problems that are attributed to the installation process but are caused elsewhere.

2.5.1 I want to use more than 16 Megabytes of memory. Will any of the BSD based systems support it?

When using NetBSD and FreeBSD, there is no SOFTWARE limitation on more than 16Meg of memory. There are still hardware limitations. The limit is caused by DMA controllers which copy memory images around the system. Many cards which people use in VESA and EISA machines either emulate ISA cards (in order to work with *BSD) or are really ISA cards. There are reports of people having trouble with more than 64Meg of memory, but anyone rich enough to have that kind of memory should be paying for his OS. :-)

Recently some folks have been reporting that they are getting warnings like these:

<pre>
hostname /netbsd: sd0: not queued
hostname /netbsd: aha0: DMA beyond end of ISA
hostname /netbsd: sd0: not queued
aha0: DMA beyond end of ISA
</pre>

This error occurs when the buffer for I/O is beyond the address range that the ISA bus can reach. With 16M you should be okay; however, some motherboards do reclaim all or part of the "lost" 384K (from the I/O "hole" from A0000-FFFFF) and put it just beyond the end of the rest of the memory, so you actually get 16M plus a little bit.

One fix is bounce buffers. FreeBSD has implemented this, and NetBSD will as soon as they come up with a method that is compatible with all of the machines that NetBSD supports.

Another fix is to turn off the reclaiming of the extra memory (most motherboards that do this allow you to disable it), hack machdep.c to force the physical memory used to 16M, or use a 32 bit bus (EISA, VLB, or PCI) controller.

Jordan K Hubbard (jkh@thrush.lotus.com) has provided this explanation of the distinction:

Just so long as you're using a DMA-using disk controller in EISA mode, rather than ISA mode, you can use more than 16 Meg of memory.
For those who may find such a distinction confusing, let me explain: You can use an ISA controller (such as an Adaptec 1542) in an EISA machine, but as it will still think it's in an ISA box and refuse to use the extra address lines, this is no different than having an ISA machine as far as >16MB is concerned. You can use an EISA controller in "ISA mode", meaning it uses the older protocols for compatibility reasons (examples being Adaptec 1742 in "standard" mode, DTC 3290 in "Adaptec" mode, etc.) and again, does not use the extra address lines. The only way to get full EISA, 32MB-of-memory-and-everything, mode is to use an EISA controller in full EISA mode (for Adaptec 1742, this is "enhanced" mode, for DTC 3290 it's "DTC" mode, the Ultrastor 24F in EISA (rather than IDE emulation) mode, etc.).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

In addition, several other types of ISA controllers which do NOT use DMA will not cause problems. IDE, ESDI, and RLL controllers are examples of this type of card. The discussion above also applies to VESA and VLB cards.

So, the bottom line is that you are limited to the amount of memory that your DMA equipped devices can access. Once again, the machine is only as strong as its weakest link.

2.5.2 I tried to use a device in my computer that should be there. When I did, I got a "Device not configured" error. What do I do now?

Garrett A. Wollman (wollman@emba.uvm.edu) provides us with this brief discussion in answer to a specific question. It wears well as a generic answer as well.

When any program tells you ``Device not configured'', it's trying to tell you something very important about what you tried to do: namely, that the device you tried to access is not configured into the running operating system. This is the error message corresponding to ENXIO. There are three major causes for this error:

1) The kind of device you requested was not configured into the system.
This is R.W.'s problem; the generic kernels are not distributed with the Berkeley Packet Filter enabled by default. To correct this, you must add the appropriate device or pseudo-device to your kernel configuration file and recompile. (In this particular case, `pseudo-device bpfilter number-of-network-interfaces'.)

2) The kind of device you requested was configured into the system, but either the device you requested would use more than the maximum you configured into the system, or, if a physical device, it was not found during autoconfiguration. To solve this, either change your configuration file, or change the I/O settings on the device to match what the file says.

3) The major or minor device number specified by the device's entry(ies) in /dev is incorrect. To solve this, re-MAKEDEV the device (read the MAKEDEV script for more details). Hopefully whatever change caused the kernel's internal device tables to get changed also updated your MAKEDEV script; otherwise, you will have to grovel through the kernel to see what is going on.

4) A special case: Although the 'c' partition on most BSD disks is the entire disk, in many NetBSD and FreeBSD systems, the entire drive is the 'd' partition. This special case is wired into many programs, and is recognized by the kernel. From time to time, folks will try to access the 'c' partition on their hard drive, only to be rebuffed with a 'device not configured' error. Mostly, the 'c' partition is unavailable simply because the partition type is 'unused' even though it is allocated and has space set aside for it.

2.6 Customizing the system to meet my needs.

2.6.1 How do I get the system to not display the machine name, but display our company name?

Modify the /etc/gettytab file so the default profile uses this:

<pre> :im=\r\n Company Name (%t)\r\n\n:\ </pre>

2.6.2 I have a program that, under normal circumstances, starts once a second.
This regularly causes inetd to terminate the program with a 'server failing (looping), service terminated' error. How do I fix this?

The inetd program has a limit of 40 starts per minute for all programs started out of inetd.conf. You need to add a 'max starts' option to the end of your 'wait' or 'nowait' option. For example, try 'nowait.100' if you expect the program to start 90 times a minute.

--
Dave Burgess
Network Engineer - Nebraska On-Ramp, Inc.
*bsd FAQ Maintainer / SysAdmin for the NetBSD system in my spare bedroom
"Just because something is stupid doesn't mean there isn't someone that doesn't want to do it...."