1. Remain calm
To handle this incident successfully, you must remain calm. A compromised system is a call to action, but not a cause for panic. The intruders have probably been in your computer for days or possibly even weeks, so another few hours won’t make any difference. Take a deep breath and a few moments to read the following eight emergency steps, even if someone has notified you that the cracked system is being used to attack another site. Once you’ve calmed down and reviewed the emergency steps, you can consider disconnecting the affected system from the network and following the six steps of incident handling as they apply to Unix trojan programs.
2. Take good notes
3. Notify the right people and get help
4. Enforce a "need to know" policy
5. Use out-of-band communications
6. Contain the problem
7. Make backups
8. Get rid of the problem
9. Get back in business
Incident Handling
PHASE 1 - PREPARATION

Step 1.1 Establish An Incident Response Toolkit
Action 1.1.1 CDROM Burner
Action 1.1.2 Build Your Toolkit
Note 1.1.2.1 Statically Link Executables

Example of dependency discovery using the ldd command:

    # ldd /usr/bin/passwd
    libdl.so.2 => /lib/libdl.so.2 (0x40018000)
    libpam.so.0 => /lib/libpam.so.0 (0x4001b000)
    libpam_misc.so.0 => /lib/libpam_misc.so.0 (0x40023000)
    libpwdb.so.0 => /lib/libpwdb.so.0 (0x40026000)
    libc.so.6 => /lib/libc.so.6 (0x4006e000)
    /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
    libcrypt.so.1 => /lib/libcrypt.so.1 (0x4015c000)
    libnsl.so.1 => /lib/libnsl.so.1 (0x40189000)

Notice that /usr/bin/passwd has dependencies that are not at all surprising -- libcrypt, libpam (PAM = pluggable authentication modules) -- and the ever-present libc and ld-linux for Linux-based systems. Should any of these libraries become corrupted or, worse, compromised, it would affect the usage of /usr/bin/passwd and any other file that links against them.

Note 1.1.2.2 Load Path
Note 1.1.2.3 Tools

PHASE 2 - IDENTIFICATION

Step 2.1 Assign A Person To Be Responsible For The Incident
Step 2.2 Determine Whether Or Not An Event Is Actually An Incident

The quickest way to verify that a system has been compromised is to check the integrity of system binaries by looking for a change in one of the file attributes. Size, owner, creation date, and permission are examples of attributes that may show signs of tampering. Of course, the easiest way to do this is to use Tripwire. Simply compare the cryptographic hashes of the current system binaries with those that were generated when the operating system was initially installed.

If you didn't create a Tripwire database when the system was pristine, that is too bad. Your job is harder now, but there are still a lot of ways to determine that a system has been compromised. One way is to examine your syslog files, process table and file system to see if there are any "odd" messages, processes, or files.
Be on the lookout for any activity that includes:
One should also check for extraordinary files in /dev or /devices -- especially if they look like other proper entries or begin with a "." (dot) or space. Keep in mind that a lot of exploit directories use the names "..." (3 dots) or " .." (space dot dot). This naming scheme is an effective way to keep them hidden from normal view and it avoids raising red flags. Don't forget to check for these hidden file/directory names in any other world-writable filespace, including /tmp, /var/tmp, /usr/spool, and /var/spool.

Always check the password and shadow password files for accounts that don't belong. Also, keep an eye open for accounts that have passwords but shouldn't, or accounts that should be locked but aren't. The system's lastlog file can tell you if any of your "regular" users are logging in at times that are not normal for them.

Since a trust relationship is usually exploited in order to compromise a system, it’s always a good idea to take a look at /.rhosts, /etc/hosts.equiv, and /.ssh/known_hosts, as well as all users' .rhosts and .ssh/known_hosts files, for entries that do not belong. If you see anything that appears suspicious, install a sniffer on a second host and watch for connections to and from the (possibly) compromised host. At the same time, make a backup of the compromised machine to use for later analysis or evidence.

If you have any of the weird file names mentioned above, then you almost certainly have many other problems on that system. You could be lucky and find out that it's "just" an IRC server, but don't let that fool you. We have seen this pattern many times -- there is usually a packet sniffer installed somewhere or somehow on the compromised system. In any event, if you find any type of evidence similar to the patterns described above, contact your local CERT for assistance in examining the other hosts in your network and recovering your site.
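The search for hidden names described above can be sketched as a small shell function. This is a minimal sketch rather than a complete check: find_hidden_names is a hypothetical helper name, and its default directory list simply repeats the locations mentioned in the text. Run it with a find binary from your trusted toolkit, since the copy on a compromised host may itself be trojaned.

```sh
#!/bin/sh
# Sketch: list dot-prefixed and space-prefixed names under the given directories.
# find_hidden_names is a hypothetical helper name, not a standard tool.
find_hidden_names() (
    # With no arguments, fall back to the locations named in the text.
    [ "$#" -gt 0 ] || set -- /dev /devices /tmp /var/tmp /usr/spool /var/spool
    for dir in "$@"; do
        [ -d "$dir" ] || continue    # skip locations this system does not have
        # '.*' matches names such as "..."; ' *' matches names such as " ..".
        find "$dir" \( -name '.*' -o -name ' *' \) \
            ! -name '.' ! -name '..' -print 2>/dev/null
    done
)
```

Legitimate dot files will show up in the output as well, so the listing still has to be reviewed by hand.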
Step 2.3 Be Careful To Maintain A Provable Chain Of Custody

And then 6 months or a year later (or longer), the perpetrator is identified. Great! You dig into your non-locking file cabinet in your cubicle and hand over everything you have. But wait! It's not admissible as evidence. There were no witnesses who can testify that you did everything by the book (whatever that is). None of your notes are dated. Too many people had potential access to this stuff and there is no proof that its integrity is still intact.

You never know when a compromised machine will come back to haunt you, or if any of the evidence you collected will be needed in a court of law. You should try at all times to maintain a provable chain of custody of all data collected. You should read (or at least peruse) the SANS ID-FAQ question dealing with court-admissible evidence and/or its references for a much more in-depth answer to how to do this. Get your lawyer or legal department involved. If you can't get them involved quickly enough or they don't have the experience, at the very least you should do the following:
Be prepared to show how you were logging things before and after the compromise/break-in. Did you ramp up your logging after the fact? No? That’s too bad. Did you ever look at log files before now? No? How unfortunate. Have you been monitoring this stuff all along? No? Too bad. Did I tell you to sign, seal and date a copy of everything?

Step 2.4 Coordinate With The People Who Provide Your Network Services

Provide them with as much information as you can: dates, times, IP addresses, MAC addresses, patterns, messages. Do not notify them by e-mail unless you know for sure that it's secured. Contact them by phone, fax or in person. Don't tip your hand. You want to be able to ascertain the four W’s of the compromise. This can't be done if the perpetrator has infiltrated the ISP without their knowledge and is reading the e-mail.

Step 2.5 Notify The Appropriate Officials

So, who are the appropriate officials? If your site is security-conscious, then there should already be an Incident Contact List filled out for your organization. It's just a matter of going through the list and contacting the people. Unfortunately, if you don't have a list, then this simple task becomes much more difficult because of the stressful situation.

If you didn’t notify your boss early on, then now is the time. Also, ensure that the Network/System Security Officer at your site has been notified. Your legal department is another entity that should be notified. If your ISP hasn't yet been brought on board, contact them. Notify your local CIRT or FIRST team. Contact your local law enforcement and see if there is a person assigned to a Computer Crime section. Your local FBI field office should be notified. You are encouraged to report the incident to cert@cert.org for correlation purposes.

How important is it to make sure the proper officials are notified? If law enforcement is involved from the beginning, there may be a better chance of keeping a provable chain of custody.
The people contacted may be aware of other incidents similar to yours that will help in your efforts to contain and eradicate the problem. You will be assured of witnesses to what you do. You might have guidance as to what's proper and what's not in an ongoing investigation. Of course, you may be the one doing the instructing as events unfold. Security is not a one-way street -- the more aware your officials become, the more receptive they become to precautions.

PHASE 3 - CONTAINMENT

Step 3.1 Deploy The On-site Team To Survey The Situation

One of the first things the on-site team should do is interview the person who discovered the intrusion and the incident handler. The on-site team will need to know how the intrusion was discovered, what has been done so far, how widespread the intrusion is and who has been contacted (or who else knows about the intrusion). The INCIDENT SURVEY FORM should help tremendously with this step. The more information the incident handler can give the on-site team, the better the on-site team will be able to deal with the intrusion and determine where to begin. Time is precious, so it is essential that the on-site team doesn’t duplicate the work that the incident handler has done. Once the on-site team has assessed the situation, they should be able to make an educated guess about the direction to proceed.

The area should be secured if at all possible; unwanted volunteers get in the way and affect the response team’s performance. The team should be allowed to work without being interrupted. Physically stopping the machine at the wrong time can be just as deadly to your data as not doing anything at all. There is no room for typographical mistakes here; there is no "undo" when you are superuser.

Step 3.2 Keep A Low Profile

It is equally important to keep a low "electronic" profile. Resist the urge to use ping, finger, traceroute or whois in order to gather information about the source of an attack.
Don't scan it with nmap or telnet to any of its ports. The attackers will be monitoring for this kind of activity and will know that their compromise has been discovered.

Step 3.3 Egress Filtering

Step 3.4 Avoid Potentially Compromised Code

It is pointless to reload the system from a possibly compromised backup. Unless you can establish with a fair degree of certainty when the system was first compromised, you are much better off rebuilding the system from the distribution and (re)applying all patches - especially all of the security patches. Similarly, you should rebuild local code that you run - including your mail server (if you use the Sendmail, Inc releases, or qmail or whatever), your web server if you use Apache or another locally compiled server, and your ftp server if you use wu-ftpd or equivalent. If you were running tcp wrappers, you should rebuild tcpd from a known clean source kit.

Intruders rarely break into a system and compromise the root account without leaving themselves backdoors. You don't want to go through all this twice. Make sure you don't use tainted code. See "Steps for recovering from a UNIX Root Compromise", especially the section "Improve the security of your system and network".

Action 3.4 Backup the system

First, run the ps command with options to get a full listing, including the parent process id of each process. Send the command’s output to a file. Next, use the netstat command (or the equivalent on your system) with the options to report all open connections and not to do lookups of numbers to names. The results of this command should be sent to a file as well.

Don’t forget to back up the /tmp directory. This directory is usually cleared when the system boots up. So, tar /tmp to tape or, if you have enough free space, to a file.

At this point simply halt the system. There is little more you can safely do while the intruders have control and it is dangerous to try to play "cat and mouse".
They know far more than you do since they installed the tools they are using. Your main interest is recovering control of the system with as little additional damage as possible. Bring the system back up in single user mode from known clean system files, as you would if you booted from a CD of the system software.

Now that the problem has been contained and control of the system has been regained, we can discuss how to perform the actual system backup. The three archival commands we will discuss here are tar, dump, and dd. As we will see, each is suited for different types of backups. Combined, they form a versatile toolkit for performing backups.

Some information on syntax: the dashes preceding the option flags for tar and dump are optional. Dashes, however, are not used with dd.

In an incident-handling mode, if you are only going to try to save the data, tar may be the most appropriate choice. However, if you want to save the disk image for forensic information, dd is your best bet. For fast backup, nothing beats a disk-to-disk dump.
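The forensic use of dd mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions: image_and_verify is a hypothetical helper name, the 64k block size is an arbitrary choice, and the checksum file names are illustrative. It copies the source block for block, confirms the copy with cmp, and records a checksum that can be dated and stored with your notes.

```sh
#!/bin/sh
# Sketch: image a disk device (or file) with dd and confirm the copy.
# image_and_verify is a hypothetical helper; both paths come from the caller.
image_and_verify() (
    src=$1
    img=$2
    # Note the dd syntax: keyword=value operands, no leading dashes.
    dd if="$src" of="$img" bs=64k 2>/dev/null || exit 1
    # Byte-for-byte comparison of the source and the image.
    cmp -s "$src" "$img" || { echo "image differs from source" >&2; exit 1; }
    # Record a checksum of the image; prefer SHA-256 where the tool is available.
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$img" > "$img.sha256"
    else
        cksum "$img" > "$img.cksum"
    fi
)
```

A call might look like `image_and_verify /dev/hd0a /mnt/evidence/hd0a.img`, where both names are placeholders. The image should be written to a different disk from the one being copied.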
Step 3.4.1 Tar

Tar, when used with the -p flag, will preserve access information. If you administer a heterogeneous environment, it may be important to extract your tar files on the same platform on which they were created. This is because some operating systems, such as Solaris, support Access Control Lists. Others, such as older Linux releases, do not. If maintaining ACL controls is important at your site, note that the information will be lost.

Another thing to keep in mind when creating a tar archive is the use of absolute vs. relative path names. Tar files are restored to locations based on how they were put on the tape. If they were created using absolute path names, they will be restored to the same location. Otherwise they are restored relative to the current working directory.

To illustrate the significance, here is a true story. At the site where I used to work, we routinely got deliveries of software from our contractors. Unfortunately, one company was lax in their documentation, especially when it came to installation notes. The normal course of action with a new delivery was to unload it to a "test" area, where the code would be tested prior to being put into production. The current version remains in use until the code is tested. One day, I was given an update to install. I extracted the tar file that was delivered. Since it was backed up using absolute path names, the current version wound up being overwritten. I had to restore the original version, move it to a temporary location, extract the new files, move them to a test directory, and move the old version back to where it belonged.

Moral of the story: know what you are extracting, make sure you know where the files are going, and make sure the files don't already exist on disk. Otherwise a 15-minute task could take you all afternoon.
Whether using relative or absolute pathnames, caution should be used. If absolute pathnames are used, make sure you do not accidentally overwrite files on disk. A few lines of shell script can check that the files on a tape will not overwrite any files without you knowing it. Alternatively, if relative paths are used and the files go to the directory you are in, you need to make sure that is where you want them to wind up. A common mistake is to untar the file while still sitting in a directory full of files, /usr for example, and then having to "relocate" the files that do not belong there.

Here is one way to ensure that you don't overwrite files. First, we use tar with the "t" option to extract a file listing and save it off to a temporary file called tar_listing.out. Then we read the contents of the tar listing, extract the filename with the cut command, and test to see if a file by that name exists. If so, we print a warning and save the file off with a .orig extension. This way we can be proactive when we restore files and not just cross our fingers and hope for the best.

As a rule of thumb, it is recommended that you use relative path names, extract to a temporary directory and then copy files to where you want them to permanently reside. This way, you avoid overwriting a file by accident.
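The overwrite check described above can be sketched as a shell function. It is a hedged reconstruction, not the original snippet: check_tar_overwrites is a hypothetical name, and the sketch uses `tar tf`, whose listing holds only file names, so the cut step is not needed (a verbose `tvf` listing would need it). Names are tested relative to the current directory unless the archive stores absolute paths.

```sh
#!/bin/sh
# Sketch: warn about, and preserve, files that a tar archive would overwrite.
# check_tar_overwrites is a hypothetical helper name.
check_tar_overwrites() (
    archive=$1
    listing=${2:-tar_listing.out}
    # The "t" option lists the archive contents without extracting anything.
    tar tf "$archive" > "$listing" || exit 1
    while IFS= read -r name; do
        case $name in */) continue ;; esac    # skip directory entries
        if [ -e "$name" ]; then
            echo "WARNING: $name exists; saving a copy as $name.orig" >&2
            cp -p "$name" "$name.orig"
        fi
    done < "$listing"
)
```

Run it from the directory where you intend to extract, before running the extraction itself.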
Action 3.4.2 Use Dump

First, a full dump should be run after an upgrade or re-install of the operating system. This is because dates on files represent when the files were "mastered", not when they were actually copied to your system. Therefore their creation dates relative to your dumps will be out of sync. In other words, the files you install will be NEW yet could have older time stamps than the files they are replacing. The /etc/dumpdates file will not be accurate, and incremental dumps will not pick them up as being changed files.

Second, tar only requires that you be able to read the file in order to archive it. Dump accesses the raw device, which typically is readable only by root, so non-privileged users cannot run it (without use of sudo or a setuid script).

And third, dump only supports local UNIX file systems. It cannot dump NFS-mounted partitions. If you need to dump a remote partition, run dump on the system serving it and use hostname:device to specify a remote tape device.

Aside from full vs. incremental, you really have no control over which files get dumped. A level 0 dump captures an entire file system. Incremental dumps (levels 1-9) record files modified since the last dump of a lower level. Dump uses the /etc/dumpdates file to record what level dump was done on which file system and when. Dump also keeps track of the amount of media used. When dumping small partitions to tape, you can usually rely on defaults, but if dumping large partitions (several GB) to large-capacity media, you may need to specify tape length, density or both. Tape drive vendors can usually assist with these parameters.
The simplest form of the dump command is dump, followed by the dump level, u (update the dumpdates file), f (device name), and the file system to dump. The last parameter may be specified as a mount point such as /usr or as a disk device name such as /dev/hd0a. If you need to specify other parameters, they must be given in the order of the flags used; example 2 uses size and density, so that must be the order of the parameters.

Example 1 is a full dump of the /usr file system.

Full dump of /usr:

    # dump 0uf /dev/nrst0 /usr

Example 2 dumps /usr to a 20 GB tape drive.

Incremental (level 2) dump of /usr to a 20 GB Travan tape drive:

    # dump 2usdf 740 106400 /dev/nrst0 /usr
Action 3.4.3 dd

The input file for dd can be a tape or disk device name. This enables you to make tape-to-tape copies without having to unpack the archive. Additionally, you can do conversions on the blocks of data, permitting you to swap byte pairs - enabling you to go between SGI and other UNIX variants. Other conversions include changing uppercase to lowercase data, ASCII to EBCDIC, and others. Refer to the man pages for a complete list.

Step 3.5 Determine The Risk Of Continuing Operations

Now that the incident has been contained and multiple system backups have been made, you must take a moment to assess the risk of booting the system and continuing operations. This step is the most difficult of all the steps in this guide, but the response team must decide whether the system should be disconnected from the network and totally shut down, or run as-is to monitor the compromise.

It is difficult to imagine any conditions under which it would make sense to continue to operate a compromised system. I can think of only two cases where an organization could consider reconnecting the compromised system to the network and continuing operations. The first case requires the certain determination that the root account was not compromised. Although still serious, a non-root compromise is much easier to contain. As a result, continuing operations may be a reasonable course. The second case relies upon the use of a recent Tripwire database of the "clean" system binaries. Here you might have been able to determine exactly which files had been altered. This permits much faster recovery.

It is important to remember that the intruders got into the system somehow. If you are sure you know how they got in (perhaps from packet traces or system logs), and you know that the vulnerability has been fixed, then you can probably reconnect the system to the network and put it back into operation.
However, if the risk to the organization or data is too great, the on-site team must recommend that operations cease until the site is cleaned up and restored to health.

If, after examining log files and other sources of information, the attack is determined to be external to the organization (e.g., a DoS attack targeting the system), then the on-site team should recommend breaking the connection temporarily at the organization's firewall. If the attack is internal to the organization, the team should recommend isolating the affected subnet at the router level. If the system is the originator of the attack, the recommendation should be to pull the machine off the network, collect as much data as possible, then restore the machine to a healthy state (either by totally reinstalling or by going back to a known clean backup). Depending on the data already collected, the consensus might be to leave the machine as is for a short while to collect more data or possibly trace the attack. Before making their decision, the team needs to take into consideration other systems on the same subnet and other trusted systems that regularly connect to the affected machine.

The on-site team only provides a recommendation and presents the data to justify the recommendation. They can also provide a "what-if" scenario to the Command Post Team to better explain why their recommendation should be accepted and acted upon. The Command Post Team makes the actual decision itself.

Step 3.6 Check The Neighboring Systems And Trust Relationships

Always keep an eye on the unseen trust relationships. Who mounts (or exports) files via NFS? Who has entries in their .rhosts, .shosts, or hosts.equiv? Who has a .netrc from that host? Who shares any network segment with that host? These are your first ring of next targets to verify, then work out from there.
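The trust files named above can be located with a single find command, sketched here as a shell function. The name find_trust_files is hypothetical, and the default of searching from / is an assumption; on a large or NFS-heavy system you would likely pass specific directories such as home areas and /etc instead.

```sh
#!/bin/sh
# Sketch: list the trust-relationship files mentioned in the text.
# find_trust_files is a hypothetical helper name.
find_trust_files() (
    # With no arguments, search the whole file system.
    [ "$#" -gt 0 ] || set -- /
    find "$@" \( -name .rhosts -o -name .shosts \
                 -o -name .netrc -o -name hosts.equiv \) \
        -type f -print 2>/dev/null
)
```

Each file found still has to be read by hand; the function only tells you where the trust entries live.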
Typically an attacker doesn't compromise just one computer; they hop from host to host, and attempt to hide their tracks while creating as many potential back doors as possible.

The remarks about unseen trust relationships can't be emphasized too strongly - but even better is to inspect systems for that sort of problem BEFORE you have a cascade of compromised systems to deal with. The examples given can all be found by running the find command on each system. In practice you can get a good idea of the extent of your problems by just running the find on a few key systems. Users who believe in .rhosts files usually point them every which way. There is little excuse, though, for continuing to even allow rlogin or rsh (see the SANS ssh project). The R-services are nothing but additional doors to let unwanted people into your systems.

Step 3.7 Continue To Consult With System Owners

While working to contain the problem, it is very important to keep in contact with both the system owners and the users. Users always seem to need the system right then and there. They hound the owners/administrators of the system with constant availability queries. The owners/administrators must be able to communicate effectively to their users. This cannot be done without effective communication from the On-Site Containment Team. The On-Site Containment Team might want to designate one person to be the contact point between the team and the machine owner/administrator. This person should keep in mind that there is also a tendency for the attack to be internalized by the system owner/administrator. Effective communication, along with helpful positive reinforcement, can help lower the stress level of everyone involved.

Step 3.8 Change Passwords

If the intruders compromised the root account or if the system does not use a shadowed password structure, then you must require all users to change their passwords. The root password should be changed in any case.
Intruders routinely copy the password file to their "basecamp" system immediately after breaking into a computer. They cannot copy the shadow passwords unless they have been able to operate as the root user, but if it is possible they will grab a copy of that file too. The result is that they can then run one of the password cracking tools against those files at their leisure, and on their own system. Many users choose weak passwords that are easily guessed.

Another threat from the compromised system is the use of a network packet sniffer. Intruders will often install these on a compromised system in order to collect clear text user names and passwords. Note that the only defense against this attack is to use encrypted connections, such as ssh or connections run over SSL.

PHASE 4 - ERADICATION

Step 4.1 Determine the cause and symptoms of the incident
Step 4.2 Improve defenses
Step 4.3 Perform vulnerability analysis
Step 4.4 Remove the cause of the incident
Step 4.5 Locate the most recent clean backup

Restore reads back archives created by dump. It can be used to generate a listing or extract files. Note that it will not read back tar files, nor can tar read files created by dump. Restore has both an interactive and a non-interactive mode. Interactive mode permits you to "browse" through a dump file to search for what you want to retrieve.

Interactive restore of /etc/hosts:

    cd /tmp; restore ivf /dev/nrst0
    restore> cd etc
    restore> add hosts
    make node ./etc
    restore> extract
    ...
    Specify next volume #: 1
    Extract file ./etc/hosts
    Add links
    Set directory mode, owner, and times
    Set owner/mode for '.'? [yn] y
    restore> quit

The add command builds the list of files to search for. When extracted, each file’s full path is recreated relative to the current directory, so the restored file is /tmp/etc/hosts.

PHASE 5 - RECOVERY

Step 5.1 Restore The System

At this point it is worth taking a little while to look over the information you have collected.
You will be anxious (and probably under pressure) to get the system back up, but by looking through the material you have collected you may discover some things about the way the intruder had set up their tools which will help you harden your newly installed system against similar attacks.

Once you have gotten copies of everything that will be overwritten, reformat the system areas and reinstall the system. Copy back the password, shadow and configuration files from your saved copy of /etc, inspecting them carefully and correcting them before copying. Look especially for user accounts having a UID of zero and for accounts that have an empty password. Be sure there aren't extra users. Immediately change the root password. Once the system is back in operation, all other users should be required to change their passwords. In Solaris and other systems which have the capability, make sure the system is configured to require a password (in Solaris it is the PASSREQ=YES entry in /etc/default/login).

Then, if possible, apply all patches to the system before putting it back in service. Keep in mind that the intruders got into the system somehow and you want to close as many of the holes as you can so that they don't come directly back in. They are almost certain to try.

It would be best to rebuild or reload from some known clean source all of your locally compiled applications. This is especially true of a web or mail server. For more on setting up your web server securely see http://security.tsu.ru/info/www/secfaq/eng/www-security-faq.html, and especially the references given in the bibliography link on that site. CERT also maintains a document that is worth reviewing: http://www.cert.org/other_sources/websec.html is a link that points to the w3.org and apache.org security pages.
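The account inspection described in this step (UID zero and empty passwords) can be sketched with two awk one-liners wrapped in a shell function. The name audit_accounts and the "uid0" and "nopass" output labels are hypothetical; the field positions follow the usual colon-separated passwd and shadow layout, which is an assumption to confirm on your platform.

```sh
#!/bin/sh
# Sketch: flag UID-0 accounts and accounts with an empty password field.
# audit_accounts is a hypothetical helper; point it at your SAVED copies of the files.
audit_accounts() (
    passwd_file=${1:-/etc/passwd}
    shadow_file=${2:-/etc/shadow}
    # Field 3 of passwd is the UID; root is expected, anything else needs a close look.
    awk -F: '$3 == 0 { print "uid0 " $1 }' "$passwd_file"
    # Field 2 of shadow is the password hash; empty means no password is required.
    awk -F: '$2 == "" { print "nopass " $1 }' "$shadow_file"
)
```

A call such as `audit_accounts /mnt/saved/etc/passwd /mnt/saved/etc/shadow` (placeholder paths) lets you review the saved files before anything is copied back.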
Step 5.2 Validate The System

Once you have installed the system and all the patches you are going to apply, use find to look for all the files that are owned by root and are SUID (find / -user root -perm -4000 -print should do it). These, and the files which are run BY root (such as from the rc scripts or out of inetd), are the ones that intruders use as attack portals. Go through the list of SUID files very carefully and turn off the SUID bit on all files that don't really require it. For example, if you don't run quotas you probably don't need (in Solaris) /usr/lib/fs/ufs/quota to be SUID. There are many similar cases.

Carefully review inetd.conf and make sure you do not have any services enabled which are not absolutely needed. It is strongly recommended that tcp wrappers be installed and configured to protect portmapper or rpcbind and any services you start from inetd. Check through the rc scripts to be sure no unneeded services are getting started there.

This is just a brief list of some suggestions for making it more difficult for intruders to regain access to your system. See the notes in http://www.cert.org/tech_tips/root_compromise.html, especially the section "Improve the security of your system and network".

Step 5.3 Decide When To Restore Operations

Step 5.4 Monitor The Systems

If at all possible, use a network-monitoring tool to watch the traffic to the system once it is back up. You may discover that the intruders are trying to get back in. From traces of that activity you will know which techniques they used and where they came from (but keep in mind that they are probably attacking you from some other compromised system). If you are attacked repeatedly, contact the responsible party at the attacking site (preferably by phone rather than email) and have a polite chat with them.

PHASE 6 - FOLLOW-UP

Step 6.1 Develop a follow-up report
