Sitemap and Website Map Generator

Search engine robots use the site map in its XML version, and visitors use the HTML version to navigate the site when they need to.
The sitemap format has since been extended with image and video tags, and even with a set of tags that make it the equivalent of an RSS feed.

You can create a site map with a single command using the script provided here, edit the generated document in the built-in viewer (or in any text or XML editor), and then upload the file to the root of your site.
The file must be saved in XML or text format. The XML format used is a standard created by Google and adopted by Yahoo and Live Search (Microsoft).



Concepts

How do I create a site map?

Using the graphical interface, just specify the name of the home page and click the "Generate" button.

Why make a site map?

Whether it is in XML format and registered with Google, or in HTML, the map helps a website get referenced better. In addition, when the sitemap.xml file is registered, Google provides a link analysis with a report of problems, as well as statistics.
The report lists search results and the pages that could not be indexed.

(Screenshot of Simple Map)

XML, text, HTML, which format to choose?

The XML format is now recognized by the leading search engines. It gives directions to Googlebot and other search robots. Simple Map creates this XML document according to the format originally specified by Google.
- priority tag - indicates which pages are most important.
- lastmod tag - gives the date of last modification, which is used in conjunction with the frequency.
- changefreq tag - indicates how often the page is likely to change, and so how often the robot should revisit it, from always for a very large site with frequently changing pages to yearly or never for static documents (for example, official format specifications with a version number).

The text format provides only a list of page URLs. It is accepted by Google.

The HTML format is intended for visitors to your site. They can view links, headings, descriptions and other information. It may list only some of the pages.
It is also read by search engines and can be used to point them to pages that are not indexed, especially when the site has several levels of directories and the deepest ones are not always crawled.

A text or HTML file is a simple list of URLs, whereas the XML format consists of tags that follow a standard format.

Sitemap formats

XML format

The container is urlset, and it contains a number of url tags corresponding to the pages of the site.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     <url>
        <loc>https://www.iqlevsha.ru/</loc>
        <lastmod>2005-01-01</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.8</priority>
     </url>  
</urlset>
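As an illustration of the structure above, here is a minimal Python sketch that writes a urlset from a list of pages. It is not the code of the generator described in this article; the function name `build_urlset` and the dictionary layout of `pages` are assumptions made for the example. The namespace URI is the one published by sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemap protocol, as published by sitemaps.org.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(pages):
    """Return sitemap XML for `pages` (hypothetical layout).

    Each page is a dict with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys.
    """
    # Serialize the sitemap namespace as the default one (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Keep the tag order used in the example above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_urlset([
        {"loc": "http://www.example.com/", "lastmod": "2005-01-01",
         "changefreq": "monthly", "priority": 0.8},
    ]))
```

Building the tree with an XML library rather than by string concatenation means that characters such as & in URLs are escaped automatically.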

Images in sitemaps

To index an image, use the following format:

<url>
   <loc>http://example.com/sample.html</loc>
   <image:image>
       <image:loc>http://example.com/image.jpg</image:loc>
   </image:image>
</url>
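The image tags belong to a separate namespace that has to be declared on the urlset container. The Python sketch below shows one way to produce such a file; the function name `build_image_sitemap` and its input mapping are assumptions for the example, and the image namespace URI is the one Google documents for image sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(page_images):
    """Return sitemap XML from a mapping {page URL: [image URLs]}."""
    # Register prefixes so the output uses <image:image>, not <ns1:image>.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in page_images.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_image_sitemap({
        "http://example.com/sample.html": ["http://example.com/image.jpg"],
    }))
```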

Learn more at the Google Webmaster Center.

Video in sitemaps

See Google's FAQ about video sitemaps (in English).

News sitemap

In addition to article URLs that contain a unique ID, a specific sitemap is required to submit your articles to Google News.
It is a standard XML map with additional tags.

In fact, these tags turn the sitemap into the equivalent of an RSS file.

The sitemap should only contain articles published in the last two days.
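The two-day rule can be applied as a filter before the file is written. This is a minimal Python sketch; the article layout (a dict with `url` and `published` keys) and the function name `recent_articles` are assumptions for the example, not part of the Google News format.

```python
from datetime import datetime, timedelta, timezone


def recent_articles(articles, now=None, max_age=timedelta(days=2)):
    """Keep only the articles published within `max_age` before `now`.

    `articles` is a list of dicts with a timezone-aware 'published'
    datetime (hypothetical layout chosen for this sketch).
    """
    now = now or datetime.now(timezone.utc)
    return [
        article for article in articles
        # Also skip articles dated in the future.
        if timedelta(0) <= now - article["published"] <= max_age
    ]
```

Passing `now` explicitly makes the filter easy to test with fixed dates.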

Sitemap index

An index is a file containing a list of sitemaps. If you have several sitemaps, or if the site map is split into several files, the index lets you specify their URLs.
There is no need to create an index for a single sitemap, and even sitemaps with different content types can now be combined into one, as we will see.

The index file is also in standard XML format.
The container is sitemapindex, and it contains a number of sitemap tags.

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     <sitemap>
        <loc>http://www.example.com/sitemap1.xml</loc>
        <lastmod>2004-10-01T18:23:17+00:00</lastmod>
     </sitemap>
</sitemapindex>
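To show how a program can consume such an index, here is a short Python sketch that lists the sitemaps it declares. The function name `list_sitemaps` and the sample constant are assumptions for the example; the namespace URI is the one published by sitemaps.org.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SITEMAP_NS}

SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
</sitemapindex>"""


def list_sitemaps(index_xml):
    """Return (loc, lastmod) pairs for each sitemap tag in the index.

    lastmod is None when the tag is absent.
    """
    root = ET.fromstring(index_xml)
    return [
        (entry.findtext("sm:loc", namespaces=NS),
         entry.findtext("sm:lastmod", namespaces=NS))
        for entry in root.findall("sm:sitemap", NS)
    ]


if __name__ == "__main__":
    for loc, lastmod in list_sitemaps(SAMPLE_INDEX):
        print(loc, lastmod)
```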

Multiple content types in the same sitemap

To cope with the growing number of sitemap file types, Google decided to allow all content types in a single file.
A file with multiple content types looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
     <url>
        <loc>http://www.example.com/mapage.html</loc>
        <image:image>
            <image:loc>http://example.com/image.jpg</image:loc>
        </image:image>
        <video:video>
            <video:content_loc>http://www.example.fr/mavideo.flv</video:content_loc>
            <video:title>Regardez grandir le petit dernier.</video:title>
        </video:video>
     </url>
</urlset>

So the url tag holds three kinds of tags: loc for the page, image:image with image:loc for the image file, and video:video with video:content_loc for the video file.
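Reading such a combined file requires the three namespaces to be named explicitly. The Python sketch below extracts the page, image and video addresses of each entry; the function name `read_entries` and the sample constant are assumptions for the example, and the image and video namespace URIs are the ones Google documents for these extensions.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
}

SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/mapage.html</loc>
    <image:image>
      <image:loc>http://example.com/image.jpg</image:loc>
    </image:image>
    <video:video>
      <video:content_loc>http://www.example.fr/mavideo.flv</video:content_loc>
    </video:video>
  </url>
</urlset>"""


def read_entries(xml_text):
    """Return one dict per url tag with its page, images and videos."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "page": url.findtext("sm:loc", namespaces=NS),
            "images": [e.text for e in url.findall("image:image/image:loc", NS)],
            "videos": [e.text
                       for e in url.findall("video:video/video:content_loc", NS)],
        })
    return entries


if __name__ == "__main__":
    print(read_entries(SAMPLE))
```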

Important tips for website maps

XML Site Map

HTML site map

RSS Site Map

Sitemap index

Check the sitemap.xml file

An online validation site can check your sitemap XML file. You need two files:
- sitemap.xsd, the format schema, included in the archive.
- sitemap.xml, the list of pages, located on your website or on your local computer.
See resources.
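Before submitting the file to a validator, a few basic checks can be run locally. The Python sketch below is not a schema validation against sitemap.xsd; it only tests well-formedness and a handful of rules from the sitemap protocol (root element, presence of loc, allowed changefreq values, priority between 0.0 and 1.0). The function name `check_sitemap` is an assumption for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SITEMAP_NS}
CHANGEFREQ_VALUES = {"always", "hourly", "daily", "weekly",
                     "monthly", "yearly", "never"}


def check_sitemap(xml_text):
    """Return a list of problems found; an empty list means none found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]
    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")
    for number, url in enumerate(root.findall("sm:url", NS), start=1):
        loc = url.findtext("sm:loc", default="", namespaces=NS)
        if not loc.strip():
            problems.append(f"url #{number}: missing loc")
        changefreq = url.findtext("sm:changefreq", namespaces=NS)
        if changefreq is not None and changefreq.strip() not in CHANGEFREQ_VALUES:
            problems.append(f"url #{number}: invalid changefreq {changefreq!r}")
        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"url #{number}: invalid priority {priority!r}")
    return problems
```

A full validation still needs an XSD-aware tool and the sitemap.xsd schema mentioned above.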

Submit the sitemap.xml file

The XML file must be placed at the root of the site, alongside index.html or index.php.

According to sitemaps.org, the XML file can be submitted in three ways:

  1. Register the map on the search engine's website.
  2. Add a line to robots.txt.
  3. Send a request to the search engine from a script or a browser.

Add a sitemap

Create a Google Webmaster account if you don't already have one.
Google will provide you with an identification file to upload to your site. After that, return to your Google account and click the "Verify..." button. Then wait about a day before returning to your account for the results.

Ping

You can also register the map via ping; see "What do I do after I create my Sitemap?" in the FAQ listed in the resources.
When your sitemap is updated, you do not need to register it again; you can notify the search engine via ping:

https://www.google.com/ping?sitemap=http://www.example.com/sitemap.xml

Replace www.example.com with the URL of your site, and google.com with the domain of the search engine concerned: yahoo, ask, etc.
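When the ping address is built by a script, the sitemap URL should be percent-encoded because it is passed as a query parameter. The Python sketch below only builds the address and sends nothing. The function name `ping_url` and the `engine_host` parameter are assumptions for the example, and it is also an assumption that a given search engine accepts the /ping path shown above.

```python
from urllib.parse import urlencode


def ping_url(sitemap_url, engine_host="www.google.com"):
    """Build the ping address for `sitemap_url` on `engine_host`.

    The /ping path and the 'sitemap' parameter follow the example
    in this article; other engines may use a different address.
    """
    return f"https://{engine_host}/ping?" + urlencode({"sitemap": sitemap_url})


if __name__ == "__main__":
    print(ping_url("http://www.example.com/sitemap.xml"))
```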

Use robots.txt

According to the Google blog, you can now add a Sitemap entry to the robots.txt file, and it will be taken into account when the robots of Google and other search engines read this file.
The syntax is as follows:

User-Agent:*
Disallow:
Sitemap: http://www.example.com/sitemap.xml

The robots.txt file is placed at the root of the site, like the sitemap file and the index.html (or other) home page.

If you have several sites, you can list the sitemap URLs of all of them in the robots.txt file of one site, one per line.

User-Agent:*
Disallow:
Sitemap: http://www.example.fr/sitemap.xml
Sitemap: http://www.example.com/sitemap.xml
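To check which sitemaps a robots.txt file declares, the Sitemap lines can be read back with the Python standard library. This sketch uses `urllib.robotparser`, whose `site_maps()` method is available from Python 3.8; the function name `sitemaps_from_robots` and the sample constant are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """User-Agent:*
Disallow:
Sitemap: http://www.example.fr/sitemap.xml
Sitemap: http://www.example.com/sitemap.xml
"""


def sitemaps_from_robots(text):
    """Return the sitemap URLs declared in the robots.txt content."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in sitemaps_from_robots(ROBOTS_TXT):
        print(url)
```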

Sitemap Generator

How does it work?

The program recursively scans the content of the site, starting from the home page and following the links to every page connected to it, and builds a list of all the pages that search engines should reference.
A list of valid extensions in the source code determines which file types are processed.
The program currently works on a local copy of the site. A number of websites offer to build a site map directly on the hosted site.
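The traversal described above can be sketched in a few lines of Python. This is an illustration of the principle and not the code of smap.php: the names `crawl` and `LinkCollector`, the default extension list and the `index.html` start page are assumptions for the example, and a work list replaces the recursion.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(site_url, local_root, start="index.html",
          extensions=(".html", ".htm", ".php")):
    """List the URLs of the pages reachable from `start` in a local copy."""
    base = site_url.rstrip("/") + "/"
    host = urlparse(base).netloc
    seen, pending = set(), [start]
    while pending:
        rel = pending.pop()
        path = os.path.join(local_root, *rel.split("/"))
        # Skip pages already visited, unknown file types and missing files.
        if (rel in seen or not rel.lower().endswith(extensions)
                or not os.path.isfile(path)):
            continue
        seen.add(rel)
        collector = LinkCollector()
        with open(path, encoding="utf-8", errors="replace") as page:
            collector.feed(page.read())
        for href in collector.links:
            # Resolve relative links against the URL of the current page.
            target = urlparse(urljoin(base + rel, href))
            if target.netloc == host:  # ignore links to other sites
                pending.append(target.path.lstrip("/"))
    return sorted(base + rel for rel in seen)
```

Pages that no other page links to are never reached, which matches the description: only pages connected to the home page end up in the list.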

Syntax:

php smap.php [options] site-url local-directory

Example:

php smap.php http://www.example.com c:\example.com  

To view parameters, enter:

php smap.php  

The list of recognized extensions and of files to exclude is in the options.php file. You can also exclude a page automatically by giving it this meta tag:

<meta name="robots" content="noindex">
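Detecting this tag in a page is straightforward with an HTML parser. The Python sketch below shows one possible test; it is not taken from smap.php, and the names `RobotsMetaParser` and `is_excluded` are assumptions for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Set `noindex` when a robots meta tag contains the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        # The content may hold several directives, e.g. "noindex, nofollow".
        content = attributes.get("content") or ""
        directives = [item.strip().lower() for item in content.split(",")]
        if "noindex" in directives:
            self.noindex = True


def is_excluded(html_text):
    """Return True when the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.noindex
```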

Configuration

You can adapt the program to your site by changing the variables in the options.php file (or option.sol for the source).

By default, the program handles the static files of a WordPress site. The dynamic content then has to be added to the map.

Get the program

Get source code

The source of the command-line script is included in the archive. It is a Scriptol program, kept clear and compact by the text-processing functions of that programming language.

Simple Map license: Mozilla Public License 1.1.

Changes

Resources