Building a Search Engine Friendly Sitemap XML with Laravel
Published on by Eric L. Barnes
A few years ago search engines recommended submitting sitemaps to help with indexing your website; today the importance of doing so is debatable.
I’m of the mindset that creating and submitting one can’t hurt, so I spent a little time putting one together and wanted to share how easy this is in Laravel.
What is a sitemap
If you are not familiar with a Sitemap, Google defines it as:
A sitemap is a file where you can list the web pages of your site to tell Google and other search engines about the organization of your site content. Search engine web crawlers like Googlebot read this file to more intelligently crawl your site.
They also outline the following reasons why you might need one:
- Your site is really large. As a result, it’s more likely Google web crawlers might overlook crawling some of your new or recently updated pages.
- Your site has a large archive of content pages that are isolated or not well linked to each other. If your site pages do not naturally reference each other, you can list them in a sitemap to ensure that Google does not overlook some of your pages.
- Your site is new and has few external links to it. Googlebot and other web crawlers crawl the web by following links from one page to another. As a result, Google might not discover your pages if no other sites link to them.
- Your site uses rich media content, is shown in Google News, or uses other sitemaps-compatible annotations. Google can take additional information from sitemaps into account for search, where appropriate.
Your site might not hit those criteria, but as I mentioned, I still believe it’s worth submitting one just to be safe.
The Sitemap Protocol
The official Sitemaps website outlines all the information you will need for building your own sitemap. Instead of reading through the whole spec, here is a basic sample with a single URL entered:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
As you can see, it’s just an XML file with an individual <url> for each of your site’s pages.
A single file can hold up to 50,000 URLs (and must be no larger than 50MB uncompressed), but you can also split them across multiple files and use an index file to point to the others.
The spec shows the index style like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap1.xml.gz</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap2.xml.gz</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>
Each <loc> points to a file that contains the <url> items, like in the first example.
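To make the splitting rule concrete, here is a minimal sketch in plain PHP (no Laravel required). The function names chunkSitemapUrls and buildSitemapIndex, and the sitemap{n}.xml file naming, are hypothetical choices for illustration and are not part of the spec or of this site's code.

```php
<?php

declare(strict_types=1);

/**
 * Split a flat list of URLs into sitemap-sized chunks.
 * The default chunk size is the protocol's 50,000-URL limit.
 */
function chunkSitemapUrls(array $urls, int $perFile = 50000): array
{
    return array_chunk($urls, $perFile);
}

/**
 * Build the index XML that points at each chunk file.
 * Assumption: chunk files are named sitemap1.xml, sitemap2.xml, ...
 * and $lastmod is already a valid W3C date string.
 */
function buildSitemapIndex(string $baseUrl, int $fileCount, string $lastmod): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

    for ($i = 1; $i <= $fileCount; $i++) {
        // Escape the URL so characters such as & are valid inside XML.
        $loc = htmlspecialchars("{$baseUrl}/sitemap{$i}.xml", ENT_XML1 | ENT_QUOTES, 'UTF-8');

        $xml .= "  <sitemap>\n";
        $xml .= "    <loc>{$loc}</loc>\n";
        $xml .= "    <lastmod>{$lastmod}</lastmod>\n";
        $xml .= "  </sitemap>\n";
    }

    return $xml . '</sitemapindex>' . "\n";
}
```

With 120,000 URLs, chunkSitemapUrls would return three chunks, and buildSitemapIndex('https://example.com', 3, '2016-06-01') would list three <sitemap> entries.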
For this tutorial, I’m going to use the index style because I have records coming from different tables. This allows me the opportunity to customize each list of URLs in the correct format without having to do extra processing.
Building a Sitemap Controller
My sitemap will have three primary sections: blog posts, blog categories, and podcast episodes. Each of these will be in its own file, and the index will point to them.
To get started let’s create a new sitemap controller:
php artisan make:controller SitemapController
Now open this file and let’s create a sitemap index.
Creating the Sitemap Index
Create a new index method that will generate the XML needed:
public function index()
{
    $post = Post::active()->orderBy('updated_at', 'desc')->first();
    $podcast = Podcast::active()->orderBy('updated_at', 'desc')->first();

    return response()->view('sitemap.index', [
        'post' => $post,
        'podcast' => $podcast,
    ])->header('Content-Type', 'text/xml');
}
The post and podcast queries are needed to generate the last modified timestamp in our index view as that informs the crawlers if new content has been added since they last looked.
Also, if you are not familiar with this return style, it returns a response object with an assigned view and sets the Content-Type header to text/xml. If you only return a view(), the header method isn’t available, so going through response() first gives you access to it.
Next our sitemap.index view file will look like this:
<?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://laravel-news.com/sitemap/posts</loc>
    <lastmod>{{ $post->publishes_at->tz('UTC')->toAtomString() }}</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://laravel-news.com/sitemap/categories</loc>
    <lastmod>{{ $post->publishes_at->tz('UTC')->toAtomString() }}</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://laravel-news.com/sitemap/podcasts</loc>
    <lastmod>{{ $podcast->publishes_at->tz('UTC')->toAtomString() }}</lastmod>
  </sitemap>
</sitemapindex>
For this view, I’m just taking my custom publishes_at timestamp and using Carbon to set the timezone to UTC and finally using the Carbon Atom string helper to format it correctly.
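If you want to see what that formatting step produces without Carbon, here is a rough equivalent using PHP's built-in date classes. The sitemapLastmod helper and the sample date are hypothetical and only illustrate the UTC conversion and Atom format.

```php
<?php

declare(strict_types=1);

/**
 * Convert a timestamp to UTC and format it as an Atom (W3C) date string,
 * which is the format the <lastmod> element expects.
 */
function sitemapLastmod(DateTimeImmutable $when): string
{
    return $when
        ->setTimezone(new DateTimeZone('UTC'))
        ->format(DATE_ATOM);
}

// Hypothetical publish time: 8:30 AM in New York during daylight saving time (UTC-4).
$publishedAt = new DateTimeImmutable('2016-06-01 08:30:00', new DateTimeZone('America/New_York'));

echo sitemapLastmod($publishedAt), "\n"; // 2016-06-01T12:30:00+00:00
```

Normalizing to UTC keeps every <lastmod> value in the same offset, regardless of the timezone your application stores or displays dates in.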
Creating the Sitemap URL file
The next step is creating each URL file. The controller gets three new methods, and all are very similar. Here is an example:
public function posts()
{
    $posts = Post::active()->where('category_id', '!=', 21)->get();

    return response()->view('sitemap.posts', [
        'posts' => $posts,
    ])->header('Content-Type', 'text/xml');
}

public function categories()
{
    $categories = Category::all();

    return response()->view('sitemap.categories', [
        'categories' => $categories,
    ])->header('Content-Type', 'text/xml');
}

public function podcasts()
{
    $podcast = Podcast::active()->orderBy('updated_at', 'desc')->get();

    return response()->view('sitemap.podcasts', [
        'podcasts' => $podcast,
    ])->header('Content-Type', 'text/xml');
}
Once that is set up, here is an example view file from sitemap.posts:
<?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  @foreach ($posts as $post)
    <url>
      <loc>https://laravel-news.com/{{ $post->uri }}</loc>
      <lastmod>{{ $post->publishes_at->tz('UTC')->toAtomString() }}</lastmod>
      <changefreq>weekly</changefreq>
      <priority>0.6</priority>
    </url>
  @endforeach
</urlset>
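For comparison, the same output can be sketched in plain PHP without Blade. This is only an illustration: the buildUrlset function and the array shape of each entry are assumptions, not the Post model used above. One thing it makes explicit is the escaping of each URL, which Blade's {{ }} syntax handles for you in the view.

```php
<?php

declare(strict_types=1);

/**
 * Render a <urlset> document from a list of entries.
 * Assumption: each entry is ['loc' => string, 'lastmod' => string],
 * with lastmod already formatted as a W3C date string.
 */
function buildUrlset(array $entries, string $changefreq = 'weekly', string $priority = '0.6'): string
{
    $lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ];

    foreach ($entries as $entry) {
        // Entity-escape the URL so query strings with & stay valid XML.
        $loc = htmlspecialchars($entry['loc'], ENT_XML1 | ENT_QUOTES, 'UTF-8');

        $lines[] = '  <url>';
        $lines[] = "    <loc>{$loc}</loc>";
        $lines[] = "    <lastmod>{$entry['lastmod']}</lastmod>";
        $lines[] = "    <changefreq>{$changefreq}</changefreq>";
        $lines[] = "    <priority>{$priority}</priority>";
        $lines[] = '  </url>';
    }

    $lines[] = '</urlset>';

    return implode("\n", $lines) . "\n";
}

// Hypothetical entries for illustration.
echo buildUrlset([
    ['loc' => 'https://example.com/first-post', 'lastmod' => '2016-06-01T12:30:00+00:00'],
    ['loc' => 'https://example.com/search?q=a&page=2', 'lastmod' => '2016-06-02T09:00:00+00:00'],
]);
```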
With that in place and duplicated for each of your sections, it should be ready to go as soon as you add your routes. Here is how mine are set up:
Route::get('/sitemap', 'SitemapController@index');
Route::get('/sitemap/posts', 'SitemapController@posts');
Route::get('/sitemap/categories', 'SitemapController@categories');
Route::get('/sitemap/podcasts', 'SitemapController@podcasts');
You may notice I elected not to use the .xml file extension; it is not a requirement. However, if you would like to use it, you can add the extension to the route:
Route::get('/sitemap.xml', 'SitemapController@index');
Then just adjust the views to point to the proper location.
Wrap up
As you can hopefully see, building your sitemap is not that difficult with Laravel, especially when your site structure is simple like mine. If you’d like to automate this whole process, packages do exist, and using one might save you time.
If you’d like to go further, you can also build image and video sitemaps, as well as create your own XSL stylesheet.
Happy SEOing!
Eric is the creator of Laravel News and has been covering Laravel since 2012.