The Administration has one or more interconnected Windows Workgroups, Windows NT PDC/BDC or Windows 2000 Active Directory domains. All users have Windows desktops. All central applications run on Windows servers.
Throughout this chapter the word Windows means a version of Microsoft Windows. Where the precise version is important it will be stated. Code examples are based on a Red Hat Linux system; other distributions may have subtle differences.
The content of this Scenario should be read in conjunction with the general comments made in previous Chapters.
To recap on what was said in Chapter 5, planning for the transition phase is very important: the success of an OSS project will be judged as much on the smoothness of the transition as on the quality of the final service. It is likely that any practical transition from one system to another will take place over a period of months or even years. During this time, data must be moved, people trained, software installed, and the business of the Administration must carry on without disruption.
Careful planning will be required, and large administrations should run a pilot phase to test the plan before putting it into effect on a large scale.
This Scenario can be divided into the following:
A Windows workgroup is a group of Windows computers co-operating loosely on the network by declaring themselves to be part of the same “workgroup”. There is no security aspect to workgroups – they serve only to group machines conveniently in browser lists.
Users wishing to share files with others can make available “shares” – parts of their directory hierarchy – either for general access or with a password being required.
There is no co-ordination of usernames and passwords in this model. Indeed, with some versions of Windows there is no real concept of file ownership.
Migrating from a workgroup scheme to any other scheme will involve collecting up important files by hand, one machine at a time.
In the Windows NT domain model, one or more computers act as domain controllers to co-ordinate usernames and passwords. One of these server machines is designated the Primary Domain Controller or PDC, and all changes are handled by that machine. There may also be one or more Backup Domain Controllers or BDCs to provide redundancy and load-sharing.
Windows NT domains usually include one or more fileservers (which may be the same machines that are running PDC and BDC functions). The fileservers provide storage for roving profiles (user desktops, documents, and settings) and may also provide “home directory” space, shared filestore, and print spooling services.
In a well-managed domain, users are normally told to keep all files in their desktop or their home directory so that no important data is held on individual PCs. Migrating data from such well-managed environments to new systems is relatively simple, as the system administrators know where to find all the files that matter.
The Windows NT domain model becomes very hard to manage effectively for large numbers of users, so Windows 2000 introduced a hierarchical domain model. This is known as Active Directory or AD and it makes use of ideas from both the Internet Domain Name System (DNS) and the Lightweight Directory Access Protocol (LDAP).
As in Windows NT domains, AD usually provides fileservers to hold roving profiles and home directories so it should be simple to find the important files when planning the migration process.
Because AD allows LDAP access, there are more migration options available to a site that uses AD. For example, it should be possible to use the AD servers to hold username and password data for OSS servers and clients: this could be convenient where a small part of the total user base is to be moved to OSS, as the user management process can be left almost unchanged.
The two main routes considered here are:
Add OSS machines to existing Windows domains and gradually move data and users across, followed by removing the old proprietary servers. It is possible to migrate clients and servers independently.
Adding servers to the Windows domain is one of the fastest ways to gain benefit from OSS. For instance, the combination of the GNU/Linux Operating System with Samba gives a powerful and low-cost file/print server that can be used in place of a Windows system without any changes to the existing client environment.
Running OSS clients in a Windows domain is a low-risk form of co-existence, as no changes to servers are required. It can be used where a small number of people will be using OSS desktops in an otherwise Windows-only environment.
Build a parallel OSS-based infrastructure and migrate users and their data in groups, with minimal interaction between old and new systems.
This is much simpler than running a mixed Windows/OSS system, but it does make co-operation between people using Windows and those using OSS systems rather more difficult.
Both routes are summarised in diagrams below. The first route provides tighter integration between old and new systems during the transition, but requires significantly more planning and implementation effort.
A constraint on the choice of route will be the way in which the Administration is organised and how this maps onto the logical and physical structure of the computing installation.
The early stages of most migration routes include a co-existence phase, where both Windows and OSS systems are in use, often accessing the same data. These can be particularly useful models where a partial migration is planned, with some groups moving to OSS and others staying with the old system.
The technical details of making these changes are in Section 14.6 onwards. But first we discuss the technical background and the tools needed.
There are many similarities between current proprietary systems and the Open Source systems that could be chosen to replace them. In particular, graphical user interfaces (GUIs) have tended to converge on a fairly standard “look and feel” which reduces problems for end-users moving from one system to another. End-user training will still be required, to help people deal with the things that are different and to get the best out of the new system.
Behind the similar appearance of the GUIs, there are some important differences between Windows and OSS systems. These are particularly apparent at the system administration level. It is here that most training and planning will be needed. OSS systems such as GNU/Linux do have management GUIs, but large installations are more commonly managed using command-line tools, as these lend themselves to scripting, process automation, remote management and advanced control. It is this ability to automate tasks that makes Unix and OSS system administrators so productive.
In addition to the differences in management processes, there are also some important differences in the service provided. These must be planned for and dealt with during the transition.
Computer users identify themselves using usernames and passwords. In some Administrations they may also use smart cards or other cryptographic devices to provide stronger proof of identity.
Some Administrations may have “structured” usernames that encode information about the user. For example, the username cfg27 might belong to the 27th person to be registered in the Corporate Finance Group. Others allow people to choose their own usernames, or simply use their real name. Structured username schemes can normally be used in OSS systems without change. One exception is that OSS usernames cannot start with a digit, which can cause difficulty where the structured username begins with a number.
There are some issues that might affect the more ad-hoc systems. Usernames in Windows systems are generally case-preserving and case-insensitive. This means that if a person is given the username “Mary”, she can type “mary”, “MARY”, or even “mArY” at login time without problems. It also means that whenever the system displays a username (such as the owner of a file) it will use the form originally typed by the administrator when the username was created – in this case, “Mary”.
On the other hand, usernames in Unix and OSS are case-sensitive. The user must type their username exactly in the form that was originally registered. Conventionally, usernames are made up entirely of lower-case letters and numbers with no other characters used, and with a maximum length of eight characters.
These restrictions have been greatly relaxed in recent years, and modern systems will permit much longer usernames with a wider character set. Certain authentication and authorisation schemes now implement case-insensitive usernames: the LDAP-based scheme proposed in this document is such a scheme, so usernames like “Mary” and “FinancialController” are quite possible. Care should be taken though, as there may be older packages around that make assumptions based on the old rules. In particular, it would be very unwise to allow spaces or certain other punctuation characters in usernames.
It would be good practice to limit usernames to use those characters allowed in mail names so that login usernames can be used as mail names as well.
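The conservative username rules described above can be checked mechanically before accounts are migrated. The following is a minimal sketch rather than a definitive policy: the function name valid_username and the MAX_LEN variable are hypothetical, and the eight-character limit reflects only the traditional convention, which a site using the LDAP-based scheme may well choose to relax.

```shell
# Use the C locale so that the range a-z matches lower-case ASCII only.
LC_ALL=C; export LC_ALL

# Illustrative default taken from the traditional convention, not a system limit.
MAX_LEN=${MAX_LEN:-8}

# Succeeds if the name uses only lower-case letters and digits,
# does not start with a digit, and is no longer than MAX_LEN.
valid_username() {
    name=$1
    [ -n "$name" ] || return 1
    [ "${#name}" -le "$MAX_LEN" ] || return 1
    case $name in
        [0-9]*)      return 1 ;;   # starts with a digit
        *[!a-z0-9]*) return 1 ;;   # capitals, spaces, punctuation, etc.
    esac
    return 0
}

for candidate in mary cfg27 Mary 27cfg "mary smith" financialcontroller; do
    if valid_username "$candidate"; then
        echo "ok:     $candidate"
    else
        echo "reject: $candidate"
    fi
done
```

Running a check of this kind over the list of existing Windows usernames gives an early indication of how many accounts would need renaming under the chosen rules.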
Modern OSS systems allow passwords of almost any length, and permit a very wide set of characters to be used. It is good practice to encourage the use of long passwords (10 or more characters) with a good mix of letters, numbers, punctuation and varied capitalisation. The password setting utilities generally refuse to set very weak passwords unless forced by an administrator, and many sites may decide to enforce even stronger rules.
Some commercial Unix variants still truncate passwords to eight characters so if a mixed environment is planned, this should be taken into account.
Migrating passwords from existing proprietary systems to new OSS systems is not always possible, as passwords are usually held in an encrypted hashed form. The transition plan may have to include the reissue of passwords to all users, or possibly a password collection and synchronisation phase.
Any network of more than a handful of computers needs a networked naming and authentication service. In Windows NT this is known as the Domain Controller. In later Windows systems it is Active Directory. Novell NDS is also widely installed, and other proprietary systems have their own naming and authentication systems.
Most Unix and OSS systems can interwork with almost all common naming and authentication services. GNU/Linux is particularly strong in this respect. The service proposed in this document is based on LDAP, but it is also possible to use multiple naming and authentication systems at once, which may be useful during the transition phase.
A very important part of any transition plan concerns the migration of data from the old system to the new one. If a “big bang” migration is planned then this will be a one-off operation, but in the more likely case that parallel running is envisaged, cross-platform file access will be necessary. Great care must be taken to avoid data loss, and to avoid the confusion that could result from having separate modifiable copies of a file in both “old” and “new” environments.
This is the most obvious migration issue, and is dealt with in detail in Section 14.8 below. The normal approach is to use OSS applications that can read files written by the proprietary application that they replace, though in some cases it may be appropriate to plan for a bulk format conversion as part of the migration process.
Special data such as macros and scripts are likely to need attention from skilled programmers during the migration.
As with usernames, Windows filenames are case-insensitive and (to some extent) case-preserving. Some Windows applications tend to capitalise the first letter of filenames and make other changes that the user may or may not be aware of. The Windows environment also carries the legacy of the DOS “8.3” filename format, which still shows up in some utilities. Windows filenames commonly contain spaces, and normally use the Unicode character set. Windows uses “\” as the directory separator.
Although it is less obvious to GUI users, Windows absolute filenames must include a “drive letter” indicating the physical device that holds the file, or they must have the actual name of the server if the file is on a “network drive”. These restrictions can be a problem to managers of large Windows systems who try to provide a seamless service in the face of hardware changes.
Other proprietary systems treat filenames in different ways. VMS, for example, has case-insensitive filenames that usually include one dot and may include a version number after a semicolon.
Filenames in Unix and OSS systems have different rules. Here, filenames are fully case-sensitive and the system does not make any changes to names supplied by the user. Names use an 8-bit character set determined by the locale currently in use (in most of Europe, the character set is ISO 8859-15). The only characters GNU/Linux does not permit in filenames are the directory separator (“/”) and the null character. In practice, however, it is unwise to include non-printing characters or characters that other systems reject: for example, the Windows FAT32 filesystem cannot store the first 32 ASCII codes or any of ", *, :, <, >, ?, \ or |. Spaces in filenames are permitted, though their presence does require command-line users to be more careful with quoting.
Unix and OSS systems do not use drive letters, and do not require the real name of the fileserver to be part of the absolute filename where network file access is used. Instead, the system presents all files as part of one seamless hierarchy. Together with the use of symbolic links in the filesystem and data-driven automounters, this gives system administrators great flexibility in separating the absolute name of a file from its physical storage location.
Almost all Windows filenames can be migrated directly to OSS servers without change. The only exception likely to be encountered in practice is filenames containing the “/” character, which will need to be modified during the transition. Users of GUI tools will probably never notice that filenames have become case-sensitive as they only ever type such names when first creating the file.
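During parallel running, files created on the OSS side may also be seen by Windows clients, so it can be worth checking names in the other direction too. The sketch below is illustrative only: portable_name and case_collisions are hypothetical helper names, and the character list is the FAT32 list quoted earlier in this section.

```shell
LC_ALL=C; export LC_ALL

# Succeeds if a single filename (not a path) avoids control characters
# and the characters that FAT32 cannot store.
portable_name() {
    case $1 in
        *\"*|*\**|*:*|*\<*|*\>*|*\?*|*\\*|*\|*) return 1 ;;
        *[[:cntrl:]]*)                          return 1 ;;
    esac
    return 0
}

# Reads names on standard input and prints (in lower case) those that
# would clash on a case-insensitive system.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

for name in "Annual Report.doc" "what?.txt" "budget:2004.xls"; do
    if portable_name "$name"; then
        echo "portable:     $name"
    else
        echo "not portable: $name"
    fi
done

printf '%s\n' Report.txt report.txt notes.txt | case_collisions
```

In practice the input to case_collisions would come from a command such as find run over the shared directory tree.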
Many migration plans are likely to include a period of parallel running where some people are using OSS systems and others are still using proprietary ones. Where files need to be accessed by members of both groups, special provisions may be needed.
File sharing in Windows systems uses the SMB (Server Message Block) protocol, which is a very complex technology with multiple levels of backwards compatibility. It is used with dedicated fileservers and also in “peer-to-peer” mode, where individual PCs make available parts of their filesystem on the network. Well-managed Administration environments are likely to be based on dedicated servers rather than ad-hoc sharing.
Non-shared user files in a Windows environment may be held in several different places:
On a local disk of the user's desktop or laptop PC – for instance the one referred to as “the C drive”.
In the user's “roving/roaming profile” – this includes most preference settings and also the content of the Windows desktop and (usually) the “My Documents” folder. The roving profile is held locally on whatever PC the user is actively using, and is synchronised back to a profile store at logout time. This provides a handy backup facility, but can have serious performance implications with users reporting very slow logouts.
In a “home directory” on a centrally-managed fileserver. This is a common option for large networks of desktop systems as it is easy to manage backups properly.
It is not sensible to try to provide dual access to files held on individual desktop or laptop PCs, so any files on local disks or in roving profiles should be moved to managed fileservers at an early stage in the migration process.
The main networked file access mechanism for Unix and OSS systems is the Network File System (NFS). This is a much simpler protocol than SMB, and its specification has always been openly available.
Options for implementing dual access fall into two broad categories: adding dual-protocol support to the servers, or adding dual-protocol support to the clients. Other things being equal, it is normally easier to change servers than clients, and almost always easier to adjust OSS systems than proprietary ones. The options are summarised in the table:
| | Windows servers | OSS or Unix servers |
|---|---|---|
| Windows clients | SMB file access is standard. | Servers support SMB using the Samba package. This is mature software with excellent performance. |
| OSS clients | GNU/Linux clients can access SMB shares. This is fairly easy where client machines have just one user at a time, but gets more involved where time-sharing machines are used. Commercial Unix variants do not normally have SMB client capabilities. It is possible to add NFS service to Windows servers, but this can be very expensive. | NFS file access is standard. GNU/Linux clients can use SMB if desired as part of a migration plan, but this is less efficient. |
This section discusses some of the key OSS components that will be used when migrating from proprietary systems.
Samba is a file and print server package for OSS systems. It implements Microsoft's SMB protocol and in many cases can completely replace the functions of a Windows server. Samba can also act as a Windows NT Domain Controller and is capable of storing domain management data in a directory accessed using LDAP.
Samba also provides SMB client tools suitable for scripting, which are very useful when diagnosing problems with SMB networks and also when doing bulk file migration from Windows servers.
Samba is maintained by a core group of about 30 very active volunteers around the world. More information can be found at http://www.samba.org/.
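As an illustration of scripting with the Samba client tools, the sketch below builds an smbclient command that copies the whole of a share into the current directory. It is a sketch under assumptions: fetch_share and DRY_RUN are hypothetical names, the server, share and user are examples, and the command is only printed unless DRY_RUN is set to 0. The smbclient sub-commands used (recurse, prompt, mget) should be checked against the installed version before relying on them.

```shell
# Print, or with DRY_RUN=0 actually run, an smbclient bulk-copy command.
# Arguments: server name, share name, Windows username.
fetch_share() {
    server=$1 share=$2 user=$3
    # recurse: descend into subdirectories; prompt: turn off per-file prompts;
    # mget *: fetch everything that matches.
    set -- smbclient "//$server/$share" -U "$user" -c 'recurse; prompt; mget *'
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Note: the printed form omits the quotes around the -c argument.
        echo "$*"
    else
        "$@"
    fi
}

fetch_share nt4server frederick frederick
```

When run for real, smbclient prompts for the user's password on the server, so a wrapper of this kind is best suited to supervised, one-machine-at-a-time collection of files.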
OpenLDAP is an implementation of the Lightweight Directory Access Protocol (LDAP). It includes a directory server, a set of data access and management tools, and a set of libraries to support LDAP in other applications.
OpenLDAP is maintained by a small core group plus a large number of contributors. One of the core group works on the project full time.
NSS is the Name Service Switch: a technology used by GNU/Linux and some commercial Unix variants to allow different name services to be used when looking up hostnames, usernames, groupnames etc. Many modules are available, of which the ones most relevant to this project are:
files: Simple lookups based on local text files
DNS: hostname lookups based on the Domain Name System
LDAP: lookups based on LDAP – mostly for usernames and groupnames but also usable for many other purposes.
SMB: lookups using Windows' SMB protocol (See 14.5.5 below)
PAM is the Pluggable Authentication Module system. Like NSS, it is common to GNU/Linux and several commercial Unix derivatives. PAM is also available on FreeBSD. PAM allows great flexibility in configuring the authentication and authorisation process. Relevant modules include:
LDAP: uses LDAP bind operations to verify user credentials.
SMB: Uses Windows NT domain operations to verify user credentials.
Access: restrict access to networked services.
Cracklib: enforce quality checks on new passwords.
Samba allows an OSS system to provide a file service to Windows clients. SMBFS works the other way around: it allows an OSS system to access files held on Windows servers. SMBFS is provided with most recent GNU/Linux distributions but is not normally found in commercial Unix systems.
The access-control model used by Windows filesystems is different from that used by GNU/Linux and other OSS systems, so there are some limitations in what can be achieved with SMBFS.
Another product from the Samba team, Winbind allows individual GNU/Linux machines to be attached to Windows NT domains. It maintains a mapping between Windows NT authenticators (SIDs) and Unix-style UIDs and GIDs. Winbind can do many other things that reduce the load on system administrators, such as setting up Unix-style environments for people when they first log in.
The disadvantage of Winbind in large networks is that each client computer builds its own mapping between Windows authenticators and Unix ones. This can cause problems in the later stages of migration when OSS fileservers are introduced.
When using Winbind, usernames and group names used by GNU/Linux are formed by concatenating the Windows NT domain name with the Windows NT username to form a unique string. This can lead to some confusion, as many common Unix-style utilities only provide space in their output for eight-character usernames. The longer names generated by Winbind get truncated in the display.
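The truncation problem is easy to demonstrate. In the short sketch below the domain name, username and separator character are examples only; the point is that once an eight-character domain name is prefixed, an eight-character display column shows nothing but the domain.

```shell
domain=THESTATE
user=frederick
separator=+        # example separator character

# The name as GNU/Linux sees it when Winbind is in use.
full_name="${domain}${separator}${user}"
echo "$full_name"

# What a utility with an eight-character username column would display.
printf '%.8s\n' "$full_name"
```

With these example values every user in the domain would appear under the same truncated name, so utilities with fixed-width output need to be used with care.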
Setting up a Samba file server as a member of an existing Windows NT domain is straightforward:
Install a GNU/Linux server, giving it a fixed IP address.
Make sure that the Samba packages are installed: typically samba, samba-common and samba-client are required. These are normally included in a “server” installation.
Edit /etc/samba/smb.conf, set domain security mode, and define the domain (workgroup) name. List the PDC and any BDCs as password servers. Define the shares that are to be served by the machine.
Create any directories that are to be shared, and set appropriate ownership and permissions.
Join the machine to the existing Windows NT domain using the Domain Admin password (or any other username and password that has the power to do this):
smbpasswd -j DOMAINNAME -r PDCNAME -U Administrator
Start samba and arrange for it to restart at reboot:
/etc/init.d/smb start
chkconfig smb on
The server will now show up in browse lists and can be used just like a Windows NT server.
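To make the smb.conf step more concrete, the sketch below writes a minimal configuration for a domain member server to a scratch file. Everything specific in it is an assumption made for illustration: the function name write_member_conf, the domain and server names, the share name and the path /srv/samba/shared. A real configuration will need more settings than this.

```shell
# Write a minimal smb.conf sketch for a domain member file server.
# Arguments: output file, domain name, PDC name, BDC name.
write_member_conf() {
    conf=$1 domain=$2 pdc=$3 bdc=$4
    cat > "$conf" <<EOF
[global]
    workgroup = $domain
    security = domain
    password server = $pdc $bdc
    encrypt passwords = yes

[shared]
    comment = Shared project files (example share)
    path = /srv/samba/shared
    writable = yes
EOF
}

# Write to a scratch location rather than /etc/samba/smb.conf.
write_member_conf "${TMPDIR:-/tmp}/smb.conf.sketch" THESTATE NT4SERVER NT4BACKUP
```

The testparm utility shipped with Samba can be used to check the syntax of the resulting file before it is copied into place.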
In the early stages of testing OSS tools it is very useful to run individual GNU/Linux machines with very simple configurations. These can be given access to files on Windows servers for compatibility and migration tests using the smbmount command.
Mounting is the Unix/OSS term for making a disk or remote filesystem part of the local machine's file hierarchy. The process is usually done automatically at boot time under the control of the /etc/fstab file, but can also be done interactively. For example, the command to bring an ISO-standard CDROM into the filesystem under /mnt/cdrom would be:
mount /dev/cdrom /mnt/cdrom
The mount command is normally restricted to use by the root user for security reasons. This is not a problem where the machine is being used by a system administrator, but can be awkward where a non-technical user is involved. GNU/Linux provides several ways around this problem:
Use a special entry in /etc/fstab that allows ordinary users to mount certain pre-defined objects. This is the usual way to allow CDROMs and floppy disks to be mounted on demand. Files on the mounted device usually show up as being owned by whoever caused the device to be mounted.
Use a setuid-root program to do the privileged operation, having first checked that it is safe. This is the easiest way to handle the mounting of remote Windows shares.
Use an automounter to mount filesystems when they are first accessed and to unmount them when no longer in use. The automounter runs as a daemon and is usually driven by network-wide configuration data. This takes more effort to set up than the other methods, but is extremely useful in large networks.
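For the first of these methods, an /etc/fstab entry typically carries the user option, for example a line such as "/dev/cdrom /mnt/cdrom iso9660 noauto,user,ro 0 0". The sketch below lists the mount points that ordinary users are allowed to mount. The function name user_mountable is hypothetical, and the sketch assumes the usual whitespace-separated fstab layout with options in the fourth field.

```shell
# Print the mount point of every fstab entry whose option list
# includes "user" or "users". Argument: path to an fstab-format file.
user_mountable() {
    awk '$1 !~ /^#/ && NF >= 4 {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "user" || opts[i] == "users") {
                     print $2
                     break
                 }
         }' "$1"
}

if [ -r /etc/fstab ]; then
    user_mountable /etc/fstab
fi
```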
In this scheme, we will use the smbmount and smbumount commands to make an existing Windows share appear as part of the local GNU/Linux filesystem. On Red Hat Linux systems these are part of the samba-client package, so make sure you have installed both the samba-common and samba-client packages. These programs are designed so that critical parts can be given root privileges, although they are not normally installed that way by default so a few commands must be run by root before they are first used:
chmod u+s /usr/bin/smbmnt /usr/bin/smbumount
Note that the command changes smbmnt rather than smbmount. This is important because smbmnt encapsulates only the functions of smbmount which require root privileges. Having done this, any user can use smbmount and smbumount and they will run with the necessary root privileges.
Now, any user can make a Windows SMB share available as part of the GNU/Linux filesystem by mounting it on a directory that they already own. Any files that are already in the directory will be invisible while the SMB share is mounted on top.
As an example, suppose the GNU/Linux user fred wants to access files on a Windows NT server called NT4SERVER in the domain THESTATE, which are shared under the sharename FREDERICK and owned by the Windows user FREDERICK. fred starts by making a new directory to mount the Windows share onto:
mkdir ~/ntfiles
(The notation “~/” means “in my home directory”.) This only needs to be done once. Now, to mount the remote share:
smbmount //nt4server/frederick ~/ntfiles \
-o username=frederick -o workgroup=thestate
The command should be typed on one line, or split up with “\” line-continuation characters as shown here. It will prompt for FREDERICK's password on the server, and then mount the Windows share on top of the directory ntfiles in fred's home directory. To avoid typing the whole thing at every login, it can be put into a script file or even made part of fred's login process.
The mounted share now behaves almost as if it were part of the local disk. Files can be created, deleted, and edited. There are some caveats though. In particular, there is no attempt to map between Unix-style access control and Windows NT ACLs so commands to change the ownership or mode of files and directories on the mounted share will have no effect.
Before logging out, it would be wise to unmount the share:
smbumount ~/ntfiles
Again, this could be made an automatic part of the logout process if required.
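One way to automate the mount and unmount steps is a small wrapper function that fred's login and logout scripts can call. This is a sketch only: the function name ntfiles and the DRY_RUN variable are hypothetical, the server, share and account names are those of the example above, and the commands are merely printed unless DRY_RUN is set to 0.

```shell
NT_SERVER=nt4server
NT_SHARE=frederick
NT_USER=frederick
NT_DOMAIN=thestate
MOUNT_POINT=$HOME/ntfiles

# Usage: ntfiles mount | ntfiles umount
ntfiles() {
    case $1 in
        mount)
            set -- smbmount "//$NT_SERVER/$NT_SHARE" "$MOUNT_POINT" \
                   -o "username=$NT_USER" -o "workgroup=$NT_DOMAIN" ;;
        umount)
            set -- smbumount "$MOUNT_POINT" ;;
        *)
            echo "usage: ntfiles mount|umount" >&2
            return 2 ;;
    esac
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"      # show what would be run
    else
        "$@"           # run it; smbmount will prompt for the password
    fi
}

ntfiles mount
ntfiles umount
```

For a user whose shell is bash, the mount call could go in ~/.bash_profile and the unmount call in ~/.bash_logout; other shells have their own equivalents.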
The process described in this section does not create any permanent link between accounts on GNU/Linux and accounts on existing Windows NT servers, so usernames and passwords must be maintained separately on each machine. The management effort involved can quickly become excessive as the number of machines grows, so this scheme is really only suitable for small test environments.
Where a larger pilot deployment of OSS desktop systems is required, it may still be convenient to keep file and authentication services on existing Windows NT servers. Samba's Winbind daemon provides an easy way to link the two environments.
Samba and Winbind are standard parts of the Red Hat Linux distribution, but they may not be installed by default on workstation setups. To use Winbind, the following packages should be installed: samba, samba-common and samba-client.
The file /etc/samba/smb.conf must be edited to show the correct Windows NT domain name in the workgroup line, and to put the system into domain security mode. Winbind configuration data also goes into the global section of this file, for example:
# separate domain and username with '+', like DOMAIN+username
winbind separator = +
# use uids from 10000 to 20000 for domain users
winbind uid = 10000-20000
# use gids from 10000 to 20000 for domain groups
winbind gid = 10000-20000
# allow enumeration of winbind users and groups
winbind enum users = yes
winbind enum groups = yes
# give winbind users a home directory location
template homedir = /home/winnt/%D/%U
# and a shell
template shell = /bin/bash
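Before settling on the UID and GID ranges, it is worth confirming that no local account already uses a number inside them. The sketch below does this for a passwd-format file; the function name uid_conflicts is hypothetical and the range shown is simply the one from the example configuration.

```shell
# Print "name:id" for every entry whose numeric ID (third field)
# lies within the given range.
# Arguments: passwd- or group-format file, lowest ID, highest ID.
uid_conflicts() {
    passwd_file=$1 low=$2 high=$3
    awk -F: -v low="$low" -v high="$high" \
        '$3 + 0 >= low + 0 && $3 + 0 <= high + 0 { print $1 ":" $3 }' \
        "$passwd_file"
}

if [ -r /etc/passwd ]; then
    uid_conflicts /etc/passwd 10000 20000
fi
```

Because the group file also keeps its numeric ID in the third field, the same function can be pointed at /etc/group to check the GID range. No output means no conflicts were found.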
For Winbind to work, certain services need to be running. To start them and to ensure that they start at each reboot, the commands are:
chkconfig smb on
chkconfig winbind on
/etc/init.d/smb start
/etc/init.d/winbind start
The machine must now be joined to the Windows NT domain. This requires a Windows NT username and password with the appropriate permissions (usually Administrator):
smbpasswd -j DOMAINNAME -r PDCNAME -U Administrator
It should now be possible to get lists of Windows users and groups with the wbinfo command:
wbinfo -u
wbinfo -g
To make the Winbind data available to the system, it is necessary to edit PAM and NSS configuration files. This should be done with great care, as it is possible to find oneself locked out of the system if these files are damaged. In /etc/nsswitch.conf add the word winbind to the passwd and group lines. In /etc/pam.d/system-auth add a line of the form:
auth sufficient /lib/security/pam_winbind.so use_first_pass
just after the equivalent auth line that uses pam_unix, and one of the form:
password sufficient /lib/security/pam_winbind.so use_first_pass
just after the equivalent password line that uses pam_unix.
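Given the risk of being locked out, one cautious approach to the nsswitch.conf change is to generate the edited file as a separate copy and inspect it before installing it. The sketch below does only that; add_winbind is a hypothetical helper name, and it assumes the passwd and group entries each occupy a single line starting at the left margin.

```shell
# Print a copy of an nsswitch.conf-format file with "winbind" appended
# to the passwd and group lines, unless it is already present.
# The original file is not modified.
add_winbind() {
    awk '/^(passwd|group):/ && $0 !~ /(^|[ \t])winbind([ \t]|$)/ {
             $0 = $0 " winbind"
         }
         { print }' "$1"
}

if [ -r /etc/nsswitch.conf ]; then
    add_winbind /etc/nsswitch.conf
fi
```

The output can be redirected to a scratch file, compared with the original using diff, and copied into place by hand. Keeping a root shell open on another console while testing the change makes it possible to back it out if logins stop working.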
It will be necessary to restart the Name Service Cache Daemon at this stage:
/etc/init.d/nscd restart
The translation of Windows usernames and groups into Unix-style passwd file format can now be seen with:
getent passwd
getent group
To automate the creation of user home directories on first login, add this line to the session part of /etc/pam.d/system-auth:
session required /lib/security/pam_mkhomedir.so skel=/etc/skel/ umask=0022
(Ensure that the above is entered as a single line.) Note that this will create a separate Unix home directory for the user on each workstation that they use. It may also be useful to put a script into the /etc/skel directory to cause each user to automatically mount their Windows NT files in a standard place at login time.
In principle, GNU/Linux desktop machines can join AD (Active Directory) domains in much the same way that they join Windows NT domains. Indeed, if the AD domain is running in NT-compatibility mode then exactly the same process can be used.
AD domains also offer the possibility of using LDAP for authentication and data lookup. This is the same scheme that is proposed for larger networks of pure OSS systems, and is well worth considering. By extending the AD schema to include Unix data, it would be possible to manage the users of OSS desktops and servers with AD administration tools. Storing the data centrally is preferable to the Winbind scheme used with Windows NT, as it keeps the mapping between Windows NT IDs and Unix IDs consistent across all machines.
Samba can take on the role of the Primary Domain Controller, thus allowing all Windows servers to be eliminated even if some Windows clients are still required. Note that it is not possible to replace just the PDC or just a BDC in a domain: all domain controllers must be running the same system – either Windows or Samba. This is partly because the PDC-BDC replication protocol has not been reverse-engineered. Also, Samba Domain Controllers take a different approach to resilience: they delegate it to the LDAP servers where the data is actually stored.
Setting up a Samba+LDAP Domain Controller is too large a job to describe in detail here, but it can be done in a day or so by an experienced person. The larger task is planning the migration of usernames and groupnames from an existing domain. Some of the work is covered in the Samba-LDAP-HOWTO from IDEALX (see references in Section 14.12 below). The same source provides a set of migration tool skeletons that can be a very good base to build on.
In summary, the process is:
Install OSS server(s) with Samba and OpenLDAP. It may be necessary to build Samba from source: for instance, Red Hat Linux 7.3 did not include the LDAP-enabled version.
Add the Samba schema definitions to the LDAP server.
Set up the LDAP server with an appropriate base Distinguished Name (DN) and directory tree structure (possibly using the tools from IDEALX to populate the tree with boilerplate entries).
Start Samba and test the Domain Controller function.
Use pwdump on the PDC to list all user entries in the SAM. Transfer the result as a text file to the OSS server.
Configure the IDEALX smbldap-migrate-accounts.pl tool to match the environment being built. This is non-trivial as there are a lot of options to consider.
Run smbldap-migrate-accounts.pl on the data transferred from the PDC. This will create entries in LDAP for all domain users. It will also set their SMB passwords to match the ones used under Windows NT (but this will not enable Unix or GNU/Linux logins, as the Windows NT passwords are hashed and a different hash scheme is used for OSS systems).
The tool can create home directories at the same time if desired.
Copy user files and roaming profiles from Windows servers to the new OSS servers, or re-bind the existing Windows servers to the domain now served by Samba Domain Controllers.
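Before running the migration tool it can be useful to sanity-check the file transferred from the PDC. The sketch below assumes the usual colon-separated pwdump layout (account name, RID, LANMAN hash, NT hash, then further fields). The class and function names are illustrative only and are not part of the IDEALX tools.

```python
from typing import List, NamedTuple, Optional


class SamEntry(NamedTuple):
    """One account as listed by pwdump (assumed field order, see above)."""
    name: str
    rid: int
    lm_hash: str
    nt_hash: str

    @property
    def is_machine_account(self) -> bool:
        # Windows NT machine trust accounts have names ending in "$".
        return self.name.endswith("$")


def parse_pwdump_line(line: str) -> Optional[SamEntry]:
    """Parse one pwdump-style line, returning None if it looks malformed."""
    fields = line.rstrip("\r\n").split(":")
    if len(fields) < 4 or not fields[0] or not fields[1].isdigit():
        return None
    return SamEntry(fields[0], int(fields[1]), fields[2].upper(), fields[3].upper())


def load_pwdump(path: str) -> List[SamEntry]:
    """Read a pwdump text file, silently skipping lines that do not parse."""
    entries = []
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            entry = parse_pwdump_line(line)
            if entry is not None:
                entries.append(entry)
    return entries
```

Counting user and machine accounts in the result, and comparing the totals with the figures shown by the Windows NT administration tools, gives an early warning if the transfer was incomplete.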
Large networks are likely to need multiple LDAP servers with data replication for resilience. If one Samba Domain Controller is associated with each LDAP server, a scheme very much like the Windows PDC/BDC setup can be realised.
There are many other issues to be considered, such as:
Choice of tools for user management
How Windows NT groups and ACLs will be mapped to Unix-style groups and ACLs
Whether to use a new domain name for the OSS-based service
How to create password hashes usable by OSS systems (or whether to continue using Windows NT or LANMAN hashes, even in a pure OSS environment)
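On the last point: Windows NT and LANMAN hashes are one-way, so a hash in another scheme can only be produced at a moment when the cleartext password is available, for example when a user next changes it. As a minimal sketch, the code below produces and checks the salted SHA-1 ({SSHA}) form that OpenLDAP accepts in the userPassword attribute. The function names are assumptions made for illustration, and a real deployment should choose a scheme to suit its own security policy.

```python
import base64
import hashlib
import hmac
import os


def make_ssha(password: str, salt: bytes = b"") -> str:
    """Return an {SSHA} userPassword value for the given cleartext password."""
    salt = salt or os.urandom(8)  # random salt unless the caller supplies one
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode("ascii")


def check_ssha(password: str, stored: str) -> bool:
    """Check a cleartext password against a stored {SSHA} value."""
    prefix = "{SSHA}"
    if not stored.startswith(prefix):
        return False
    raw = base64.b64decode(stored[len(prefix):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 bytes; the rest is salt
    candidate = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return hmac.compare_digest(candidate, digest)
```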
The bulk of the data held by an Active Directory is in an LDAP-accessible store. At first sight, this should make it easy to replace AD servers with OSS equivalents. Unfortunately this is not the case: Windows 2000 systems do not use pure LDAP for all data access, and they use a non-standard variant of Kerberos for authentication.
Several OSS teams are working to fix the problem, but at the time of writing the only feasible way to support Windows 2000 and Windows XP clients is to run them in Windows NT domains as described above.
This is the simplest of all possible migration schemes. The interaction between Windows and OSS systems is limited to the one-off transfer of user files. In outline, the process is:
Build the core OSS environment. This will include LDAP servers to hold configuration and username data, master installation servers, one or more file and print servers, and enough client workstations for systems management staff.
Build a development and training facility, with enough desktop workstations to allow training of appropriate-size groups of people. The initial job of this facility is to validate and fine-tune the workstation setup before the main rollout.
At this stage, the workstation build process should be finalised so that machines can be set up with minimal human effort. It is very important that all desktop machines are installed in exactly the same way during the main rollout phase so this should be tested carefully.
Use the development and training facility in consultation with key representatives of the user base to generate enthusiasm for the project and to gather feedback on the user interface. Make changes as needed to arrive at the “rollout image”.
Agree the training requirements and schedule.
Build a set of new desktop workstations sufficient to replace the equipment currently being used by the first group that is to migrate to OSS systems.
Register the first group of users on the new system.
Train the first group of users on the new system.
If necessary, re-set any configurations changed during training so that everyone starts with a known environment.
Replace the first group's desktop PCs with the pre-built OSS systems.
At the same time, copy the group's files across to the new fileservers and set the original copy to be read-only.
Provide active support to the first group while they get used to working with the OSS system.
Upgrade the PCs removed from the first group as necessary, and install the standard workstation image.
Repeat from step 5 (building the next set of desktop workstations) with the next group of users.
When all users have migrated to OSS systems, make archive copies of all files on the old servers and decommission them.
Where some Windows clients have to be retained (for example to support functions that are uneconomic to migrate due to non-portable software) there are two main options:
Retain a small Windows domain using one or more Windows servers.
Support the Windows clients from OSS-based servers using Samba.
The route chosen will depend on why the Windows clients are being retained, and on their geographic distribution.
In either case it is likely that Samba will be needed on one or more of the new servers, to provide file sharing between the Windows clients and the OSS-based ones.
The usual Windows web server is IIS (Internet Information Server), which provides HTTP, FTP, and Gopher services in one package. IIS has a reputation for problems with security and stability, which has prompted many organisations to replace it with an alternative. Indeed, after a series of particularly serious flaws was exploited in 2001, analysts at Gartner issued a strongly worded advisory to their customers suggesting that IIS should not be used for critical functions until it had been completely re-written by Microsoft.
There are a number of web servers to choose from when looking to replace IIS. Many of these are OSS or have very liberal licence conditions. Some of the more widely used servers are discussed in Section 11.4.2 above.
When migrating sites from IIS, the usual choice is Apache – often with PHP or Perl modules for scripting. Apache runs on GNU/Linux, FreeBSD, almost all other Unix variants, and also on Windows. This provides a wide choice of migration options.
When moving a simple website from IIS on Windows to Apache on GNU/Linux or Unix, the main issue to be aware of is that the Windows filesystem ignores letter case in filenames, but most GNU/Linux or Unix filesystems are case-sensitive. As the hierarchy of web pages is normally represented directly in the filesystem, this means that URLs become case-sensitive when moved to the Unix or GNU/Linux environment. (This would not be an issue if Apache were used on a Windows server.)
A less common issue is that IIS seems to accept both “\” and “/” as component separators – it must translate “/” to “\” for the Windows filesystem, but it appears to allow “\” to work natively. Thus, a file could be referred to in a URL as both mydir\thisfile.html and mydir/thisfile.html.
Neither issue will affect a correctly-written and self-consistent website. Unfortunately, sites built with Windows software often have inconsistent use of upper and lower case, and sometimes have the “\” character in URLs where the website file structure contains a subdirectory. Indeed, the example website distributed with early versions of IIS displays both of these issues. There are easy workarounds for both problems in Apache, which are demonstrated in the example later in this chapter. As a general rule though, it is better to correct such problems in the website data.
Some early websites used a server-side mapping from x,y co-ordinates in an image to destination URLs. This is now deprecated because it is inefficient and does not work well with non-GUI browsers, but some sites may still use it. Server-side maps in IIS take the form of files with the “.map” extension, and their format is not compatible with the equivalent Apache files.
The best approach is to convert any server-side maps into client-side maps as this also provides a better browsing experience for the user. If this is not possible, a simple Perl script can be used to edit the files into a form usable by Apache.
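The conversion to client-side maps can also be scripted. The sketch below is in Python rather than Perl and rests on an assumption that must be checked against the real files: that each line of the map follows the NCSA layout of a shape name, a URL, and then x,y pairs (for example "rect /page.html 0,0 100,50"). The function name is hypothetical.

```python
import html
import math


def map_to_client_side(map_text, map_name):
    """Convert NCSA-style server-side map text into an HTML <map> element."""
    areas, defaults = [], []
    for raw in map_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"cannot parse map line: {raw!r}")
        shape, url = parts[0].lower(), html.escape(parts[1], quote=True)
        points = [tuple(int(n) for n in p.split(",")) for p in parts[2:]]
        if shape == "default":
            defaults.append(f'<area shape="default" href="{url}">')
            continue
        if shape == "rect" and len(points) == 2:
            coords = [*points[0], *points[1]]
        elif shape == "circle" and len(points) == 2:
            # NCSA circles give the centre and a point on the edge;
            # HTML wants the centre and a radius.
            (cx, cy), (ex, ey) = points
            coords = [cx, cy, round(math.hypot(ex - cx, ey - cy))]
        elif shape == "poly" and len(points) >= 3:
            coords = [n for point in points for n in point]
        else:
            raise ValueError(f"unsupported map line: {raw!r}")
        coord_text = ",".join(str(n) for n in coords)
        areas.append(f'<area shape="{shape}" coords="{coord_text}" href="{url}">')
    # The catch-all area goes last so that the specific shapes take precedence.
    return "\n".join([f'<map name="{map_name}">', *areas, *defaults, "</map>"])
```

The resulting element is pasted into the page, and the image tag is given a usemap attribute naming it in place of the ismap attribute.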
More complex sites are likely to have dynamic pages based on scripting and database access. Most IIS sites use ASP (Active Server Pages) as the scripting framework, and might use Access or SQL Server for the database, depending on the size of the application.
There are many ways to handle the migration of ASP scripts. Some of the more popular ones are:
Chili!Soft ASP package for Unix (now called Sun ONE Active Server Pages)
ASP2PHP
Apache's Apache::ASP module
Manual conversion to a new language
Chili!Soft ASP is a proprietary product, but in some cases it could provide a very cost-effective migration route.
ASP2PHP is a standalone script converter which converts text files written in ASP and VBScript into text files written in PHP. Support for ASP files using JScript is under development. PHP is a very popular web-scripting framework with similarities to ASP, so developers should find it a fairly easy transition to make. For larger projects it is often better to have a greater separation between page design and script logic than the ASP or PHP models allow. In these cases, a manual conversion using a templating system might be a better choice.
Apache::ASP provides ASP-like features directly through the Apache framework, along with scripting in Perl. VBScript and JScript are not supported.
In some cases it may be best to consider a manual conversion from ASP to a new framework. This allows the greatest flexibility, and complex sites may well benefit from moving to a template system such as Template Toolkit (http://www.tt2.org/).
All the Apache scripting systems have database access facilities for a wide range of database types (SQL, flat file, indexed, LDAP, NIS etc) so data-driven dynamic sites of any complexity can be built.
The FrontPage web-design package introduced a set of extensions to allow remote management of web content. These have since been used by some other web-design packages.
The FrontPage extensions are available for Unix systems, but are not universally popular with Apache administrators for a variety of reasons including security issues and the large number of changes they introduce to the standard web-page storage area.
A standards-based replacement is now available in the form of the WebDAV protocol (RFC 2518). It is supported by most web servers (including Apache, using the mod_dav module) and is now the preferred website management protocol. Microsoft has supported WebDAV in its Office suite since Office 2000, and it can also be accessed directly using Windows Explorer, so a Linux/Unix/Apache server can support both OSS and proprietary clients using the same mechanism.
This example shows the complete process for migrating a simple static website from IIS on Windows NT to Apache on GNU/Linux.
Prepare the GNU/Linux server, connect it to the network, and test Apache. Most GNU/Linux distributions provide pre-configured Apache packages, so this is normally straightforward. An Internet-visible server will undoubtedly need its security tightened before being connected.
Locate the website data on the IIS server (usually in C:\InetPub) and make a copy of it ready for transfer, for example using a Zip archive package.
Copy the Zip file to the GNU/Linux machine (for example using FTP) and unpack it in the location chosen for the website data. This is configured as DocumentRoot in Apache's httpd.conf file, and is usually somewhere like /var/www/html.
Edit httpd.conf and add default.htm to the DirectoryIndex clause. (By convention, Apache is configured to look for default/home pages named index.html, while IIS uses default.htm – this action allows either name to be used.)
At this stage, the site should start to work, though it must be accessed by the name of the new server rather than the proper URL. It may also show problems where the site data has inconsistent use of upper and lower case in filenames and URLs, and where “\” has been used in URLs.
If possible, test the site at this stage and correct any problems by editing the site data. This will give the best performance. There are automated checkup tools available that will traverse the site and tell you if any links point to unavailable locations. You could also make a list of unreachable pages at this stage, and run every page through an HTML checker.
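As an illustration of such a check, the following sketch walks the unpacked site, reads the href and src attributes from each HTML page, and reports local links that use "\", differ from the real filename only in letter case, or point at nothing. It is a rough aid under simplifying assumptions (static pages only, no awareness of Alias or other server configuration), and all names in it are hypothetical.

```python
import os
import posixpath
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_site(doc_root):
    """Return (page, link, problem) tuples for local links likely to fail."""
    doc_root = os.path.abspath(doc_root)
    by_lower = {}  # lower-cased site-relative path -> path as stored on disk
    for dirpath, dirs, files in os.walk(doc_root):
        for name in dirs + files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, doc_root).replace(os.sep, "/")
            by_lower[rel.lower()] = rel
    exact = set(by_lower.values())

    problems = []
    for page in sorted(exact):
        full = os.path.join(doc_root, page)
        if not page.lower().endswith((".htm", ".html")) or not os.path.isfile(full):
            continue
        collector = LinkCollector()
        with open(full, encoding="latin-1") as handle:
            collector.feed(handle.read())
        for link in collector.links:
            parts = urlsplit(link)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link, or a fragment within the same page
            if "\\" in parts.path:
                problems.append((page, link, "backslash"))
            path = unquote(parts.path).replace("\\", "/")
            if path.startswith("/"):
                target = posixpath.normpath(path.lstrip("/"))
            else:
                target = posixpath.normpath(
                    posixpath.join(posixpath.dirname(page), path))
            if target in (".", "") or target in exact:
                continue
            actual = by_lower.get(target.lower())
            problems.append((page, link, f"case: {actual}" if actual else "missing"))
    return problems
```

Calling check_site("/var/www/html") and printing the result gives a worklist of pages to correct before going live.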
If fixing the site data is not feasible, add these configuration lines to httpd.conf:
LoadModule speling_module modules/mod_speling.so
AddModule mod_speling.c
CheckSpelling on
Note that this causes a directory scan and an HTTP redirect for each misspelled/miscapitalised part of a URL, so watch out for performance issues.
Pages incorrectly using “\” in URLs can be handled using mod_rewrite, by adding these lines to httpd.conf:
RewriteEngine on
RewriteRule ^(.*)\\(.*)$ $1/$2 [N]
Each pass replaces one \ with / in the URL (the last one, because the first group in the pattern is greedy), and the [N] flag then re-runs the rule in case there was more than one \.
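The effect of the rule can be tried out away from the server. The sketch below applies the same regular expression in Python and loops until nothing changes, which imitates the [N] flag. It is only a model of the rule's behaviour, not of mod_rewrite as a whole, and the function name is illustrative.

```python
import re

# Same pattern as the RewriteRule: greedy group, one backslash, second group.
RULE = re.compile(r"^(.*)\\(.*)$")


def apply_rewrite(url_path: str) -> str:
    """Replace every backslash with a slash, one per pass, as the rule does."""
    while True:
        rewritten, count = RULE.subn(r"\1/\2", url_path)
        if count == 0:
            return url_path  # no backslash left, so the rule no longer matches
        url_path = rewritten
```

Because the first group is greedy, each pass converts the last remaining backslash, so a path containing two of them needs two passes.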
Check for server-side image maps using a command of the form:
find /var/www/html -name '*.map' -print
Edit by hand if there are just one or two, or use a script to fix them if there are many to do.
At this stage, the whole site should work correctly. You may want to set up FTP, Samba, or WebDAV to provide access for updating pages.
To bring the site into production, either disconnect the old server and change the IP address of the new machine to replace it, or change the DNS entry of the website to point to the new server.
WebDAV can be used to manage the content of some or all of your website. In this example it is used for the whole site, so no other access should be permitted. (Other management systems such as FTP or direct file access will confuse WebDAV clients as they do not use the same locking scheme.)
Make a directory for WebDAV locks. It should be owned by the same user and group that Apache runs as (see the User and Group configuration options in httpd.conf). A good choice would be /var/httpd/webdavlocks.
Add these lines to the main part of httpd.conf:
LoadModule dav_module libexec/libdav.so
AddModule mod_dav.c
DAVLockDB /var/httpd/webdavlocks/DAVLock
Find the Directory or Location section associated with the default website, and add lines like this:
DAV On
AllowOverride None
Options Indexes
AuthType Basic
AuthName "Website Managers Only"
AuthUserFile /var/httpd/htpasswd
<LimitExcept GET HEAD OPTIONS>
require valid-user
</LimitExcept>
Make sure that the associated files and directories are owned by the same user and group that Apache runs as, using a command of the form:
chown -R apache:apache /var/www/html
Create the password file:
touch /var/httpd/htpasswd
chown root:apache /var/httpd/htpasswd
chmod 640 /var/httpd/htpasswd
Create a password for a user called webadmin (or any other name you choose):
htpasswd -m /var/httpd/htpasswd webadmin
Either restart Apache or have it re-read its configuration files, for example:
/etc/init.d/httpd reload
You can now manage the whole site using the WebDAV protocol. Windows 2000 and later clients can access it as a “Network Place” in Windows Explorer, and Office applications can save data directly to the site. GNU/Linux provides similar functions via davfs.
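A quick way to confirm that the server is answering as a WebDAV server is to send an OPTIONS request and look at the DAV response header, which RFC 2518 uses to advertise compliance classes. The sketch below does this with the Python standard library. The function names are illustrative, and the host name passed in is whatever name the new server currently answers to.

```python
import http.client


def parse_dav_header(value):
    """Return the set of compliance classes listed in a DAV response header."""
    if not value:
        return set()
    return {item.strip() for item in value.split(",") if item.strip()}


def probe_webdav(host, path="/", port=80, timeout=10):
    """Send OPTIONS to the server and return its advertised DAV classes."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return parse_dav_header(response.getheader("DAV"))
    finally:
        conn.close()
```

An empty result suggests that DAV is not enabled for that location. A result containing "2" indicates that locking is available. With the configuration above, OPTIONS is one of the methods exempted from authentication, so no password is needed for this check.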
Note that the scheme described here provides only limited security. You should read the Apache manual for more details on user authentication and choose an appropriate scheme for your needs. It may be necessary to use SSL to secure the transactions; this can be done with Apache's mod_ssl.
Many small database projects on Windows use Access. This is an attractive product for many people because it is fairly simple to get started, and it has a familiar user interface. Access has severe limitations though; it was not designed for heavy multi-user work, and it cannot cope with large datasets.
Larger databases might use SQL Server, or one of the well-known relational databases: Oracle, Sybase, DB2 etc. In the case of these larger systems, it may be best to leave the database running on the existing platform and just migrate the client applications to OSS platforms. This is particularly appropriate where the Administration has in-depth skills for the existing database and is using many proprietary features. There are standard ways to connect to relational databases across the network, so the choice of platform can be different for the database and the client applications. Also, most of the non-Microsoft proprietary databases are available on GNU/Linux and Unix platforms, so it is possible to change the operating system without having to learn a completely new database.
On the other hand, proprietary databases can be very expensive items so it is worth considering whether an OSS product could do the job effectively.
The two best-known OSS databases are MySQL and PostgreSQL. Both are mature products with large installed bases and active development teams. Both have good support for standard SQL, and are capable of very good performance.
It is also worth remembering that databases do not have to be relational. Some tasks fit better with other models, and direct use of an OSS product like Sleepycat's Berkeley DB can be extremely efficient. Similarly, the LDAP model of hierarchical networked databases is very suitable for some types of distributed application.
Access is only available on Windows platforms, so all such databases must be migrated to some other package if a completely OSS environment is planned. An interesting and useful intermediate scenario involves migrating the data to an OSS database but continuing to use Access as the front-end. This has the desirable property of removing many of the restrictions and problems of the Access data-store.
There are several ways to migrate data from Access to other databases. For simple datasets, perhaps the easiest way is to export the tables from Access as CSV (Comma Separated Values) files and then import these to the new server. This method does require the tables to be created by hand on the new server first, but it does not need any special software.
As an example, here are the commands to create a database with a simple table and import a CSV file into MySQL. First enter at a shell prompt:
mysql --user=myusername -p
Then input the following:
create database mydb;
use mydb;
create table mytable (
firstname char(30),
surname char(30),
postcode char(10)
);
load data local infile 'exportfile.csv'
into table mytable
fields terminated by ',' enclosed by '"'
lines terminated by '\r\n';
Several scripts and programs exist that will export an Access database complete with all information necessary to re-create the tables in another DBMS. Some of these produce files to be copied to the new platform, while others connect directly across the network and make the changes immediately. An example of the file-writer scripts is exportsql2.txt available from http://www.cynergi.net/exportsql. This produces files with DROP TABLE, CREATE TABLE, and INSERT statements that will replicate the Access database in MySQL.
A number of other migration tools are described in Paul DuBois' paper Migrating from Microsoft Access to MySQL (http://www.kitebird.com/articles/access-migrate.html).
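To show what the file-writer approach produces, the sketch below turns a CSV export into the same kind of DROP TABLE, CREATE TABLE, and INSERT statements. It is not one of the tools mentioned above: it reads a CSV file rather than the Access database itself, treats every column as fixed-width text, and uses names chosen purely for illustration.

```python
import csv


def sql_quote(value):
    """Quote a string for a MySQL statement (backslashes and quotes escaped)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def csv_to_sql(csv_file, table, columns, width=255):
    """Yield SQL statements that re-create `table` from rows of CSV text.

    `table` and `columns` are assumed to be plain identifiers that need
    no quoting; every column is created as char(width).
    """
    yield f"DROP TABLE IF EXISTS {table};"
    column_defs = ",\n".join(f"  {name} char({width})" for name in columns)
    yield f"CREATE TABLE {table} (\n{column_defs}\n);"
    column_list = ", ".join(columns)
    for number, row in enumerate(csv.reader(csv_file), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(columns):
            raise ValueError(f"row {number} has {len(row)} fields, "
                             f"expected {len(columns)}")
        values = ", ".join(sql_quote(field) for field in row)
        yield f"INSERT INTO {table} ({column_list}) VALUES ({values});"
```

For the table used in the earlier example, opening exportfile.csv with newline="" and writing the statements yielded by csv_to_sql(handle, "mytable", ["firstname", "surname", "postcode"], width=30) to a file gives a script that can be fed to the mysql client.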
Once the data has been migrated, it is possible to continue using Access as a front end by deleting the tables locally and linking to the newly-created tables on the MySQL server.
The process here is much the same as described above; for simple databases it is usually sufficient to export the data to a common format (usually CSV) and then import it to the new database. More complex databases including stored procedures and triggers will need more effort, and in these cases it is well worth looking at the range of tools available to help the migration process. Some of these are OSS, and some are commercial. A few examples:
pgAdmin is free software for administering PostgreSQL databases. There are plugin utilities for it that handle migration of data from other database engines. More information is available from http://www.pgadmin.org/.
SQLPorter from Realsoftstudio – a commercial product available in several variants depending on the source and target database engine. For more information, see http://www.realsoftstudio.com/overview.php.
SQLWays from Ispirer – a commercial product supporting a range of database engines. See http://www.ispirer.com/products.
SQLyog is another commercial tool – it does management on MySQL and also handles the migration of data from other ODBC-compliant databases: for details, see http://www.webyog.com/sqlyog.
The MySQL website lists a vast range of other conversion tools: see the list available at http://www.mysql.com/portal/software/convertors/index.html.
Migrating the data is sometimes the easiest part of the job, though if data is to be accessed across the network as straightforward SQL-style tables then there is not much more to do.
The problems are most likely to come from all the ancillary utilities and scripting languages that surround any practical database. SQL itself is standardised, though almost all database vendors extend it and encourage people to use their non-standard extensions. Also there are often several different ways to achieve a given result in SQL, and the choice of which is most efficient may vary from one database to another.
Many database applications are built with application generators or form-builders. These may not work with databases other than the one they were sold with.
Both MySQL and PostgreSQL have developed enormously in the past few years, so it is important to make sure you read recent reviews when considering which to use and whether to migrate.
Exchange provides email, calendar and addressbook services. It is normally used with the Outlook client on Windows, though some installations also use Outlook Web Access (OWA) to provide basic functions through a web interface.
All the functions of Exchange can be replaced by OSS packages, often very efficiently. The problems come when trying to provide them seamlessly to Outlook clients, as the communication mechanism between Exchange and Outlook is proprietary. Outlook is capable of accessing certain open standards-based services, though in some cases the user experience is different from that found when using the proprietary protocol. As a result, it is worth deciding at the outset whether to migrate to an OSS client package at the same time as the server migration is done, given that the user population will see some changes even if they stick with Outlook. The most obvious replacement client is Ximian's Evolution.
All Exchange users will have usernames and passwords stored in the system. Recent versions of Exchange use Active Directory for this, so the notes elsewhere in this document about migrating user registration data also apply to Exchange. In summary, OSS-based servers can access registration data via LDAP, so the new servers can either use the existing Active Directory or the data can be migrated to an OSS-based data store such as OpenLDAP.
Users may have considerable amounts of stored mail, both personal and shared with other group members. There may be a procedural or legal requirement to keep a log of all mail sent and received, in which case the storage and access to this data must be considered. People with portable computers may download all their mail to the laptop, or choose to maintain a synchronised copy with the master being held on the central store.
When planning a migration to OSS-based mail services it is important to locate all stored data and make sure it will still be accessible after the transition.
Exchange can use Windows groups as distribution lists – these are the same groups that Windows itself uses for access control. This is not the usual way of maintaining distribution lists in an OSS environment, but it can be supported if desired.
If Outlook is being retained as a mail client, it will need to be reconfigured to use IMAP rather than “native” access to mailboxes.
Exchange has no export facility so data migration must be done through a client connection.
For more detail on OSS mail systems, refer to Section 11.2 and Appendix C.
Outlook users build up a personal addressbook automatically as they send and receive messages. They also have access to one or more shared addressbooks if using Exchange server. The contents of these addressbooks must be migrated to an OSS-readable form. Personal addressbooks can be exported in vCard form, which is understood by many mail clients and can be parsed by scripts for conversion to other formats if needed. Similarly, shared addressbooks can be exported and then loaded into an LDAP store.
The main problems are likely to come from the fact that Outlook and Exchange tend not to use standard RFC822 mail addresses internally, so addressbook data may not include usable addresses when exported. In this case, some post-processing will be needed using a script with access to the Active Directory store to translate the “internal form” addresses to standard RFC822 addresses. This translation is likely to be needed even if Outlook is being retained as a mail client, as it will not be able to use “internal form” addresses when sending mail over standards-based protocols like SMTP.
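As a rough illustration of the export-and-load route, the sketch below reads exported vCard text and writes LDIF entries suitable for loading into an LDAP store. It makes several assumptions that should be checked against real exports: one property per line with no folded lines, plain ASCII values, and an inetOrgPerson entry per contact. Any address without an "@" is treated as an "internal form" address and left out of the entry so that it can be translated separately. The function names and the base DN are hypothetical.

```python
import re


def parse_vcards(text):
    """Return one dict per vCard, mapping property name to its first value."""
    cards, current = [], None
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t"):
            continue  # folded continuation lines are not handled in this sketch
        line = raw.strip()
        if line.upper() == "BEGIN:VCARD":
            current = {}
        elif line.upper() == "END:VCARD":
            if current is not None:
                cards.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            prop = name.split(";", 1)[0].upper()  # drop parameters such as ;INTERNET
            current.setdefault(prop, value)
    return cards


def escape_dn_value(value):
    """Backslash-escape the characters that are special inside a DN."""
    return re.sub(r'([,+"\\<>;=])', r"\\\1", value)


def card_to_ldif(card, base_dn):
    """Return an LDIF entry for one contact, or None if it has no full name."""
    full_name = card.get("FN", "").strip()
    if not full_name:
        return None
    surname = card.get("N", "").split(";")[0].strip() or full_name.split()[-1]
    lines = [
        f"dn: cn={escape_dn_value(full_name)},{base_dn}",
        "objectClass: inetOrgPerson",
        f"cn: {full_name}",
        f"sn: {surname}",
    ]
    mail = card.get("EMAIL", "").strip()
    if "@" in mail:  # skip "internal form" addresses; they need translating first
        lines.append(f"mail: {mail}")
    return "\n".join(lines) + "\n"
```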
Some Administrations make considerable use of Outlook's calendar facilities to arrange meetings and manage room bookings. These facilities can be used without Exchange, but there are some limitations.
If concurrent mig