MOSS MVP

I've moved my blog to http://blog.falchionconsulting.com. Please update your links. This blog is no longer in use; you can find all posts and comments at my new blog. I will no longer be posting to this site and comments have been disabled.

Tuesday, September 4, 2007

Convert a Sub-site to a Site Collection

I finally figured it out! This was supposed to be one of those very simple tasks that I should have been able to do without any custom code. Turning a sub-site (or web) into a site collection (or top level site) turned out to be the most difficult task I've faced yet with SharePoint 2007. In theory you should be able to do this using the following commands, which could be put into a batch file:

REM Create a test web for exporting
stsadm -o createweb -url "http://intranet/testweb" -sitetemplate "SPSTOPIC#0"

REM Export the test web to the filesystem
stsadm -o export -url "http://intranet/testweb" -filename "c:\testweb" -includeusersecurity -versions 4 -nofilecompression -quiet

REM Create a managed path for the new top level site
stsadm -o addpath -url "http://intranet/testsite" -type explicitinclusion

REM Create an empty site with a default site template (note that if you don't specify a template you have to manually activate the required features)
stsadm -o createsite -url "http://intranet/testsite" -owneremail "someone@example.com" -ownerlogin "domain\username" -sitetemplate "SPSTOPIC#0"

REM Import the site
stsadm -o import -url "http://intranet/testsite" -filename "c:\testweb" -includeusersecurity -nofilecompression -quiet

Unfortunately what you get is only a partially functional site (and in some cases not functional at all). There are several errors that you are likely to encounter after running the above using the created testweb or your own existing web. The first and most obvious error is that when you load the default.aspx page of the new site you may get a File Not Found error (note that running the above as is will not give you this error). This is the result of the publishing pages' PageLayout URL getting messed up (effectively still pointing to an old value). I addressed this specific issue with a separate command (http://stsadm.blogspot.com/2007/08/fix-publishing-pages-page-layout-url.html) and I've encapsulated that functionality into the new commands I've created, which are detailed below.

The next error you're likely to see is on the Area Template Settings page (Site Settings -> Page layouts and site templates). The specific error is "Data at the root level is invalid. Line 1, position 1".

Anyone who's done a lot of XML work should recognize this error as an XML parsing error. This error occurs if the web you imported from was set to inherit its page layouts from its parent. When a web is set up this way there's a property called "__PageLayouts" which gets set to "__inherit".

For a top level site collection this value should always be either an empty string (all page layouts are available) or XML describing which layouts are available. The import operation does not consider this and leaves the value as is, which results in the XML error when SharePoint attempts to parse "__inherit" as XML. The fix for this is simple enough - change the value to an empty string. Unfortunately that's not all we have to do. Fixing the above error results in the page loading without errors; however, the page layouts section does not load. There are still several issues that need to be resolved. If you now go and view the master gallery (Site Settings -> Master pages and page layouts) you should see all the default page layouts. If you have any custom page layouts those won't exist and will cause problems.
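The "__PageLayouts" rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the code used by the commands in this post: the property bag is modeled as a plain dictionary, and the function names and the sample XML value are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_valid_for_site_collection(value: str) -> bool:
    """Return True if the value is an empty string or parses as XML."""
    if value == "":
        return True
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False


def repair_page_layouts_property(properties: dict) -> dict:
    """Return a copy of a (hypothetical) property bag with an invalid
    "__PageLayouts" value reset to an empty string."""
    repaired = dict(properties)
    value = repaired.get("__PageLayouts", "")
    if not is_valid_for_site_collection(value):
        # "__inherit" is not XML, so it lands here and gets cleared.
        repaired["__PageLayouts"] = ""
    return repaired


# "__inherit" is the value the post reports; the XML below is a placeholder,
# not the real schema SharePoint stores.
print(repair_page_layouts_property({"__PageLayouts": "__inherit"}))
print(repair_page_layouts_property({"__PageLayouts": "<pagelayouts />"}))
```

Resetting to an empty string rather than deleting the key mirrors the fix described above: an empty string means all page layouts are available.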

Also, if you attempt to edit a file you'll notice that even though we used a publishing template it doesn't prompt you to check the file out. What's more, once you view the form for a layout you should see that only the core fields are present (no Content Type, Associated Content Type, Variations, etc.).

There are several things that need to be fixed here - first we need to activate all the features that would otherwise be activated on a new site collection (need this so that we can get the publishing workflow options enabled). Second we need to reset all the properties for the gallery to match that of our source gallery (namely we need to allow management of content types).

Third we have to change the ContentType field from being a Text field to being a Choice field (more about this in a minute). Fourth we need to re-associate each file as a Page Layout file by setting all the necessary properties (Content Type, Associated Content Type, etc.). Changing the ContentType field is the one that caused me the most headache to figure out. For some reason during the import of the site this field gets a bit messed up (note that I'm not referring to the ContentType field that is linked in from the Page content type which is associated with the library but rather another field that is part of the gallery definition itself). The field should be a Choice field with options such as "Page Layout", "Publishing Master Page", "Master Page", and "Folder". However, during the import the field is converted to a Text field - don't ask me why.

The result is that when you query for the list of available page layouts the unmanaged code that Microsoft uses to do the actual query chokes because it can't find a matching value in the list so it produces an invalid query which will always return no results back. To fix this I copy the source master page gallery on top of the target gallery using the content deployment API. I also found that (with SP2) the PublishingResources feature seemed to correct the issue (at least in the tests that I ran).

UPDATE 9/4/2007: I just discovered that another issue is related to the global navigation. If you view the navigation via the browser it will look as though everything is just peachy but if you attempt to manipulate the navigation programmatically you'll find that the PublishingWeb's GlobalNavigationNodes property is empty (no items are present). This is because the global navigation is stored at the site collection level so when you import the web it does not take the global navigation with it, just the current. The fix is simple enough - just loop through the current navigation collection and copy it to the global navigation collection. This will help to allow other programmatic manipulations of the global navigation to succeed.
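The navigation fix is essentially a copy loop. The Python sketch below shows the idea using plain lists in place of the SharePoint navigation collections; the NavNode type and the function name are hypothetical, and skipping nodes whose URL is already present is my assumption rather than something the repair command is documented to do.

```python
from dataclasses import dataclass


@dataclass
class NavNode:
    """Stand-in for a navigation node (hypothetical type for this sketch)."""
    title: str
    url: str


def copy_current_to_global(current_nodes, global_nodes):
    """Append each current-navigation node to the global collection.

    Nodes whose URL already exists in the global collection are skipped
    (an assumption made for this sketch). Returns the number of nodes added.
    """
    existing_urls = {node.url for node in global_nodes}
    added = 0
    for node in current_nodes:
        if node.url in existing_urls:
            continue
        global_nodes.append(NavNode(node.title, node.url))
        existing_urls.add(node.url)
        added += 1
    return added


current = [NavNode("Home", "/"), NavNode("News", "/news")]
global_nav = []  # empty, as reported for the imported site collection
copy_current_to_global(current, global_nav)
print([node.title for node in global_nav])
```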

UPDATE 7/6/2009:  I am now calling the code that I created to copy content types from one site collection to another.  This solves issues that can occur if a site collection content type was not created via a Feature.  I've also added additional logging to better show what is happening.

So, to summarize the things that need to be repaired after importing into your empty site collection:

  1. Activate any features that are needed by the site
  2. Set the master page gallery settings
    1. Enable content type management
    2. Set your publishing options
    3. Fix the Content Type field so that it's a Choice field type by copying the source master page gallery
  3. Copy missing content types from the source site collection.
  4. Fix the Page Layouts and Site Templates
    1. Set the "__PageLayouts" property to an empty string (can then be set to something else using SetAvailablePageLayouts() but first needs to be set to an empty string as SetAvailablePageLayouts() will not work until it's fixed)
    2. Copy any missing page layouts from the source
    3. Set all appropriate properties on each page layout
  5. Fix the PageLayout property of all publishing pages so that it has the correct URL
  6. UPDATE 9/4/2007: Update the global navigation

Some of the above can be done via the browser but most of it requires programmatic changes. In order to solve all these problems I've created two custom stsadm commands - the first will take a site and make all the repairs identified above (so it assumes you've already imported the site).

The second basically just abstracts the whole process of exporting a web, creating a site, importing into the site, and then repairing the site (this way the entire process can be done with just one command). The commands I created are detailed below (forgive the verbosity of the names - I had trouble coming up with something shorter).

1. gl-repairsitecollectionimportedfromsubsite

The code is fairly well documented so rather than discuss it all (there's a lot of it) I've linked it to this post here. The syntax of the command can be seen below:

C:\>stsadm -help gl-repairsitecollectionimportedfromsubsite

stsadm -o gl-repairsitecollectionimportedfromsubsite

Repairs a site collection that has been imported from an exported sub-site.  Note that the sourceurl can be the actual source site or any site collection that can be used as a model for the target.

Parameters:
        -sourceurl <source location of the existing sub-site or model site collection>
        -targeturl <target location for the new site collection>

The following table summarizes the command and its various parameters:

| Command Name | Availability | Build Date |
|---|---|---|
| gl-repairsitecollectionimportedfromsubsite | WSS 3, MOSS 2007 | Released: 9/4/2007; Updated: 7/6/2009 |

| Parameter Name | Short Form | Required | Description | Example Usage |
|---|---|---|---|---|
| sourceurl | source | Yes | The URL to the source sub-site to convert. | -sourceurl http://portal/subsite; -source http://portal/subsite |
| targeturl | target | Yes | The URL of the new site collection to create. | -targeturl http://portal/sites/site; -target http://portal/sites/site |

Here’s an example of how to repair the site created using the batch file above:

stsadm -o gl-repairsitecollectionimportedfromsubsite -sourceurl "http://intranet/testweb/" -targeturl "http://intranet/testsite/"

2. gl-convertsubsitetositecollection

As stated above, this command is just an abstraction of other commands - it simply calls out to stsadm to export the site (note that you can provide a previously exported site file/folder), create the managed path, create the empty site, import the site, and finally repair the imported site. As there's nothing spectacular going on here I didn't bother calling the code out in this post (download the project if you're interested in the details). The syntax of the command can be seen below:

C:\>stsadm -help gl-convertsubsitetositecollection

stsadm -o gl-convertsubsitetositecollection


Converts a sub-site to a top level site collection via a managed path.

Parameters:

        -sourceurl <source location of the existing sub-site or model site collection>
        -targeturl <target location for the new site collection>
        -owneremail <someone@example.com>
        [-createmanagedpath]
        [-haltonwarning]
        [-haltonfatalerror]
        [-includeusersecurity]
        [-suppressafterevents (disable the firing of "After" events when creating or modifying list items)]
        [-exportedfile <filename of exported site if previously exported>]
        [-nofilecompression]
        [-ownerlogin <DOMAIN\name>]
        [-ownername <display name>]
        [-secondaryemail <someone@example.com>]
        [-secondarylogin <DOMAIN\name>]
        [-secondaryname <display name>]
        [-lcid <language>]
        [-title <site title>]
        [-description <site description>]
        [-hostheaderwebapplicationurl <web application url>]
        [-quota <quota template>]
        [-deletesource]
        [-createsiteinnewdb]
        [-createsiteindb]
        [-databaseuser <database username>]
        [-databasepassword <database password>]
        [-databaseserver <database server name>]
        [-databasename <database name>]
        [-verbose]

The following table summarizes the command and its various parameters:

| Command Name | Availability | Build Date |
|---|---|---|
| gl-convertsubsitetositecollection | WSS 3, MOSS 2007 | Released: 9/4/2007; Updated: 7/6/2009 |

| Parameter Name | Short Form | Required | Description | Example Usage |
|---|---|---|---|---|
| sourceurl | source | Yes | The URL to the source sub-site to convert. | -sourceurl http://portal/subsite; -source http://portal/subsite |
| targeturl | target | Yes | The URL of the new site collection to create. | -targeturl http://portal/sites/site; -target http://portal/sites/site |
| owneremail | oe | Yes | The site owner's e-mail address. Must be a valid e-mail address, in the form someone@example.com. | -owneremail someone@example.com; -oe someone@example.com |
| createmanagedpath | createpath | No | Create a new managed path for the site collection. | -createmanagedpath; -createpath |
| haltonwarning | warning | No | Stop execution of the command if a warning event occurs during the export or import process. | -haltonwarning; -warning |
| haltonfatalerror | error | No | Stop execution of the command if a fatal error occurs during the export or import process. | -haltonfatalerror; -error |
| exportedfile | file | No | Use a previously exported site (created using stsadm's export command). | -exportedfile c:\exportdata\site; -file c:\exportdata\site |
| nofilecompression | | No | Do not compress the site when exporting (or if previously exported use an uncompressed file for the import). | -nofilecompression |
| ownerlogin | ol | Yes, if your farm does not have Active Directory account creation mode enabled | The site owner's user account. Must be a valid Windows user name, and must be qualified with a domain name, for example, domain\name. This parameter should not be provided if your farm has Active Directory account creation mode enabled, as Microsoft Office SharePoint Server 2007 will automatically create a site collection owner account in Active Directory based on the owner e-mail address. | -ownerlogin domain\name; -ol domain\name |
| ownername | on | No | The site owner's display name. | -ownername "Gary Lapointe"; -on "Gary Lapointe" |
| secondaryemail | se | No | The secondary site owner's e-mail address. Must be a valid e-mail address, in the form someone@example.com. | -secondaryemail someone@example.com; -se someone@example.com |
| secondarylogin | sl | Yes, if your farm does not have Active Directory account creation mode enabled | The secondary site owner's user account. Must be a valid Windows user name, and must be qualified with a domain name, for example, domain\name. This parameter should not be provided if your farm has Active Directory account creation mode enabled, as Microsoft Office SharePoint Server 2007 will automatically create a site collection owner account in Active Directory based on the owner e-mail address. | -secondarylogin domain\name; -sl domain\name |
| secondaryname | sn | No | The secondary site owner's display name. | -secondaryname "Pam Lapointe"; -sn "Pam Lapointe" |
| lcid | | No | A valid locale ID, such as "1033" for English. You must specify this parameter when using a non-English template. | -lcid 1033 |
| title | t | No | The title of the new site collection (this value will be overwritten when the site is imported - it is available only to help in situations in which the import fails). | -title "New Site" |
| description | desc | No | Description of the site collection (this value will be overwritten when the site is imported - it is available only to help in situations in which the import fails). | -description "New Site Description"; -desc "New Site Description" |
| hostheaderwebapplicationurl | hhurl | No | A valid URL assigned to the Web application by using Alternate Access Mapping (AAM), such as "http://server_name". When the hostheaderwebapplicationurl parameter is present, the value of the url parameter is the URL of the host-named site collection and the value of the hostheaderwebapplicationurl parameter is the URL of the Web application that will hold the host-named site collection. | -hostheaderwebapplicationurl http://newsite; -hhurl http://newsite |
| quota | | No | The quota template to apply to sites created on the virtual server. | -quota Portal |
| deletesource | | No | Delete the source site after conversion (only recommended if significant testing has occurred). | -deletesource |
| createsiteinnewdb | newdb | No | Create the site collection in a new content database. | -createsiteinnewdb; -newdb |
| createsiteindb | db | No | Create the site collection in an existing content database. | -createsiteindb; -db |
| databaseserver | ds | No | The database server containing the specified content database. If not specified then the default database server is used. | -databaseserver spsql1; -ds spsql1 |
| databaseuser | du | No | The administrator user name for the SQL Server database. | -databaseuser domain\user; -du domain\user |
| databasepassword | dp | No | The password that corresponds to the administrator user name for the SQL Server database. | -databasepassword password; -dp password |
| databasename | dn | Yes if createsiteinnewdb or createsiteindb is specified | The name of the content database to put the site collection in (will be created if createsiteinnewdb is specified). | -databasename SharePoint_Content1; -dn SharePoint_Content1 |
| suppressafterevents | sae | No | Disable the firing of After events when creating or modifying files or list items during the import. | -suppressafterevents; -sae |
| verbose | v | No | Displays logging information when executing. | -verbose; -v |

Here's an example of how to do all that the batch file above is doing (minus the creation of the testweb) as well as the repair operation all with one command:

stsadm -o gl-convertsubsitetositecollection -sourceurl "http://intranet/testweb/" -targeturl "http://intranet/testsite/" -createmanagedpath -nofilecompression -owneremail "someone@example.com" -ownerlogin "domain\user" -deletesource
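If you script the conversion for many sub-sites, it can help to assemble the argument list in code and check the one dependency the parameter table calls out (databasename is required when createsiteinnewdb or createsiteindb is used). The Python sketch below is a hypothetical helper, not part of the stsadm extensions; it only builds the argument list and does not run anything.

```python
def build_convert_command(source_url, target_url, owner_email,
                          flags=(), options=None):
    """Build an argument list for gl-convertsubsitetositecollection.

    flags:   switch names without values, e.g. "createmanagedpath".
    options: mapping of parameter name to value, e.g. {"ownerlogin": "domain\\user"}.
    This helper and its signature are assumptions made for illustration.
    """
    flags = list(flags)
    options = dict(options or {})

    # Per the parameter table: databasename is required when either
    # createsiteinnewdb or createsiteindb is specified.
    needs_db = "createsiteinnewdb" in flags or "createsiteindb" in flags
    if needs_db and "databasename" not in options:
        raise ValueError("databasename is required with createsiteinnewdb/createsiteindb")

    args = ["stsadm", "-o", "gl-convertsubsitetositecollection",
            "-sourceurl", source_url,
            "-targeturl", target_url,
            "-owneremail", owner_email]
    for flag in flags:
        args.append("-" + flag)
    for name, value in options.items():
        args.extend(["-" + name, value])
    return args


command = build_convert_command(
    "http://intranet/testweb/",
    "http://intranet/testsite/",
    "someone@example.com",
    flags=["createmanagedpath", "nofilecompression", "deletesource"],
    options={"ownerlogin": "domain\\user"},
)
print(" ".join(command))
```

Keeping the arguments as a list rather than one string means the list can be handed to something like subprocess.run without worrying about quoting URLs or paths that contain spaces.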

One area of improvement may be to pull the owner and secondary owner information from the source site collection so that this information does not have to be provided - maybe I'll do that if I feel I have the time or if people express enough interest. Note that you can specify a title and description but they'll be overwritten during the import - I only included them so that if the import fails and you're left with an incomplete site you'll at least have a name for it if you should forget to delete it and stumble upon it a year later.

Figuring out how to solve all the issues surrounding converting a web to a site collection was a real pain in the a$$ so any feedback that people have on this would be greatly appreciated - hopefully if there are others out there that have stumbled on this then they'll benefit from it as well. Keep in mind also that though I think I've solved all the errors related to the conversion it's possible that different implementations may have additional errors that I have not seen - if that's the case please let me know (especially if you've solved the problems) so that I can share with others.

Update 9/21/2007: I've fixed a couple minor bugs that pop up when converting a non-publishing site. I've also enhanced the command to take advantage of another new command I created: gl-updatev2tov3upgradeareaurlmappings (updates the url mapping of V2 bucket webs to V3 webs thereby reflecting the change of url as a result of the move so if a user tries to hit the V2 url it will redirect to the new and updated V3 url).

Update 10/2/2007: I've enhanced the command to take advantage of another new command I created: gl-retargetcontentquerywebpart (fixes Grouped Listings web parts that remained pointed at the old list rather than the newly imported list).

Update 10/12/2007: I've removed the retainobjectidentity parameter. If you attempt to use this parameter you will receive a syntax error. Turns out that retaining the object identity when going from a sub-site to a site collection just created a nightmare. However, because I still had to handle these web parts that were broken I decided to enhance the repair routines to manually retarget the DataFormWebPart and ContentByQueryWebPart web parts. So if a matching list can be found on the source then any of these web parts on your pages should be fixed so that you don't have to manually fix them (the gl-repairsitecollectionimportedfromsubsite command will do the same).

Update 7/6/2009: I removed the direct DB access code and added support for copying content types from the source.

205 comments:

Andrea said...

Great job, really! But I'm having a problem after exporting and trying to import into an existing content db (I created it in SQL, empty): it's telling me the content database was not found.
Any help?
Have a nice day!
Andrea

Gary Lapointe said...

That's because SharePoint doesn't know anything about databases created directly in SQL Server. You have to add the content database using stsadm or central admin (you can leave the empty db you created and just point SharePoint to it). Then the command should work.

Andrea said...

Thank you, that fixed the problem.
Now I'm having problems trying to convert 54 GB of subsites into one site collection db, in order to migrate it to a 2010 server.
After exporting the site (successfully), a while after the import starts the server reboots due to heavy usage of RAM and CPU... Do you suggest that I separate the commands, starting with the export and then, at a separate time, doing the import with the fix for links?
Thank you for your answer
Andrea

Andrea said...

Hello Gary,
sorry for asking so many questions, but this tool is the only way to export our projects site (with plenty of sub webs, one for each project - 130 subwebs).
Last night the server didn't reboot, but after 12 GB of import (the export was 54 GB) I got an error like this:
---------------
Error: The file you are attempting to save or retrieve has been blocked from this Web site by the server administrators.
[12/13/2010 4:41:37 AM]: FatalError: Item does not exist. It may have been deleted by another user.
---------------
This is a test export/import operation. Did that error happen because I didn't put the source content db in read-only mode?
Thank you for your help
Andrea

Gary Lapointe said...

Take a look at your blocked file types. Most likely you have, on your test environment, an extension that is blocked but is not blocked on your source environment.
