Voices of Search
A Searchmetrics Podcast
Episode 16, Pruning Your Website Content

Episode Overview

Pruning your website content. Today, our guest speaker Jordan Koene walks us through the importance of content consolidation and how to prune your website content based on industry best practices and data-driven insights.

Jordan Koene is the CEO of Searchmetrics, Inc.


Episode Transcript

Ben:                             Welcome to the Voices of Search podcast. I’m your host, Benjamin Shapiro. In this podcast, we’re going to discuss the hottest topics in the ever-changing world of search engine optimization. But before we get started, I want to remind you that this podcast is brought to you by the marketing team at Searchmetrics. We are a team of SEOs, content marketers, and data scientists that help enterprise-scale businesses monitor their online presence and make data-driven decisions using a mix of software and our expertise.

Ben:                             To support you, our loyal podcast listeners, we’re offering a complimentary digital diagnostic consultation. A member of our digital services group will advise you on how you can evaluate your historical performance, identify problem areas that are slowing your growth, and implement a foundation for sustainable success with your SEO and content marketing efforts. Just go to searchmetrics.com/diagnostic to get started.

Ben:                             Joining us again today is Jordan Koene, who is both a renowned SEO strategist and the general manager of Searchmetrics Incorporated. Today, we’re going to be talking about the topic of pruning your content to ensure your site’s maximum efficiency. Jordan, welcome back to the Voices of Search podcast.

Jordan:                         Thanks, Ben. Looking forward to sharing a few tips on pruning and content management.

Ben:                             So I’m sure that somewhere in this podcast, there’s going to be plenty of gardening jokes. Let’s just start off by asking you how green of a thumb do you have and are you much of a gardener?

Jordan:                         Yeah, I would say that’s definitely one of my weaknesses. I certainly enjoy getting in the backyard and taking care of the plants, but certainly not one of my strengths by any stretch of the imagination.

Ben:                             More time in front of a laptop than rolling around in the dirt in the backyard, but I guess that’s good for this podcast, now isn’t it?

Jordan:                         It is. I’m great with laptops, not with topsoil, so that’s how it works.

Ben:                             Great, great. Okay. Well, the joke that we’re making here is, you know, the topic of pruning, which really we mean culling through your content and deleting posts, or removing posts, or archiving posts that are no longer relevant to make sure that Google is focusing on the content that you want them to prioritize. So we’re going to talk about a couple of different topics. What exactly is pruning or culling your content? How do you know when to do it, how often should you do it? What can you expect from a good, hearty prune?

Ben:                             So with that said, Jordan, let’s start off by talking about the first topic. What is the goal of pruning or of going through and culling your content?

Jordan:                         Yeah, I mean, it’s really an exercise of prioritization at its core. What we’re looking for here is how do you ensure that the best content, the least duplicative or repetitive content is showing up on your website. It’s really about how do you maintain things?

Jordan:                         The best way to describe the situation is every spring you’ve got to clean your garage. You’ve got to get rid of things that just don’t belong in there. Some parts are trash, some parts need to be sold, and some parts need to be given back to the people who are the original owners. That’s kind of the exercise of content consolidation or pruning. It’s something that every website needs to go through. Very similar to cleaning the garage, it’s likely an exercise that needs to be done more frequently than you think.

Ben:                             I just did a quick Google search on pruning. Obviously, it pulled up a gardening reference. According to this, pruning is, “the selective removal of plants, including branches, buds, leaves, blooms, and roots. Pruning can involve the removal of living, dying, or dead plants.” To me, that’s a similar metaphor, where you have to go through and basically pick the area of either a plant or your SEO that you want to grow the most. You sort of trim away the things that don’t have a chance at being successful to give as much sunlight, energy, focus, or link juice in this case, to what your priority is.

Jordan:                         I mean, that’s the name of the game. I think the hard part for most folks that are trying to go through this exercise … I think the novelty behind this is that there’s no set method. Google, as well as webmasters, and engineers, and technical people, they’re not going to walk up to this situation and say, “This is the de facto way of doing your pruning.” They’re not going to give you some pruning shears and say, “You must cut down the tree,” or they’re not going to give you a saw and say, “Cut down the tree,” and they’re not going to give you a shovel and say, “Shovel the tree out of the dirt.” It’s really dependent on various scenarios. I think that’s what makes this exercise really challenging.

Ben:                             You’re getting onto the topic of how do you know what to prune, right? What are the articles or the webpages that you should deprioritize or unpublish? Let’s start off by talking about whether you move content or deprioritize it or unpublish it. How do you decide what is in line to get a demotion? Then what do you do with a piece of content that you do want to make a change with?

Jordan:                         Yeah, so this is a great question. The core of the question is hey, where do you start with this content removal, content consolidation, or content pruning process? It really starts with understanding your core performance or analytics.

Jordan:                         One of the things that a lot of businesses don’t take into account is the historical record of the ratio of sessions, or traffic, to the number of pages you have. What often happens in these situations is that you have this massive bloating process that happens on your website. Bloating can happen for a variety of different reasons, but the bloating process all of a sudden balloons the number of pages. Before you know it, the number of pages that actually receive traffic shrinks significantly relative to the number of pages that you actually have on the site as a whole. So basically, this ratio gets smaller and smaller and smaller. At one point, you might have, on average, one session or one click per page, and then suddenly you decrease to like 0.2.

Jordan:                         When those things happen, you’ve got to ask yourself, “Why is that? Why is the number of visitors or the number of sessions or activities that occur on my pages decreasing relative to the number of pages I have on my site as a whole?” It’s not a metric most websites look at, but it’s a great way to understand, especially as an SEO or when you’re looking at search traffic specifically, what’s going on. Is something happening, is something changing?
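The ratio Jordan describes can be tracked with a few lines of code. The following is a minimal sketch, not a Searchmetrics method: the `Snapshot` type, the period labels, and all figures are hypothetical, chosen so that traffic stays flat while the page count balloons and the ratio falls from 1.0 to 0.2 as in his example.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    label: str        # reporting period (hypothetical labels)
    sessions: int     # total sessions recorded in the period
    total_pages: int  # total pages on the site in the period


def sessions_per_page(snapshot: Snapshot) -> float:
    """Return total sessions divided by total pages for one period."""
    if snapshot.total_pages == 0:
        return 0.0
    return snapshot.sessions / snapshot.total_pages


# Hypothetical figures: sessions are flat while the page count grows fivefold.
history = [
    Snapshot("period 1", sessions=100_000, total_pages=100_000),
    Snapshot("period 2", sessions=100_000, total_pages=250_000),
    Snapshot("period 3", sessions=100_000, total_pages=500_000),
]

ratios = [(s.label, round(sessions_per_page(s), 2)) for s in history]
for label, ratio in ratios:
    print(label, ratio)
```

A falling ratio on its own does not say which pages to prune; it only signals that page growth has outpaced traffic and that a closer look is warranted.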

Ben:                             So let me make sure that I understand that. You’re saying the metric that you look at is the number of sessions per page. Not a specific page, but overall.

Jordan:                         That’s correct. That’s a great way to start. That’s a great place to start.

Ben:                             Why is that important?

Jordan:                         The reason it’s important is because one of the realizations that we have come to in the industry is that a lot of folks aren’t really looking at content on an aggregate level. They’re not looking at it on a macro level. So what actually is going on is that a lot of people are thinking about content on a real micro level. They’re saying, “Oh, man. This particular article,” or, “This particular product page,” or, “This particular category page is not performing really well. How do we fix it?” When you’re at that depth, at that level, you often forget, “What’s going on in my website at a macro level? What’s going on with all my category pages? Did I go from like one thousand to a million category pages? Was it an accident by the engineering team or the product team to release something that miraculously ballooned up the number of category pages?”

Jordan:                         These things happen. They happen frequently. The reality is that that’s one of the starting places for most websites when it comes to understanding did something happen over the last six months? The last year? The last three years even in some cases. For a lot of old tech companies, the big websites out there, this is something that’s happened over the last decade or more where their websites have become so big. It’s really hard to understand what to value on these sites.

Ben:                             So essentially the metric that you mentioned, which is the sessions per page, is the way you look in aggregate to see how likely your content in aggregate is to attract a visitor.

Jordan:                         Well, let’s put it this way. Another way to look at this is it’s understanding how many pages are actually generating any value. Because if someone isn’t visiting your pages, well then clearly there’s a problem. Here’s the interesting thing for the folks on this podcast, it’s not just about SEO sessions or SEO visitors or search visitors, it’s also looking at it on a bigger scale than that. It’s looking at just in general, are people trafficking these pages, browsing these pages, finding these pages in search? It’s just understanding on a macro level are there users that are ending up on these pages? Is there a segment of my site where users are never ending up? You’d be surprised. It’s remarkable, but it happens more often than you think.

Ben:                             Right. To me, when I hear that, of like you have .2 sessions per page, that means only one out of five of the pages on a website is getting a visitor in a given period of time. It’s 20% of your pages are actually visited, which is an interesting metric to think about.
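Reading 0.2 sessions per page as "20% of pages are visited" holds only when every visited page receives exactly one session. In general the ratio is an average, and the share of pages visited can be lower when traffic concentrates on a few pages. A small sketch with made-up page paths and counts shows the difference:

```python
def average_sessions_per_page(sessions_by_page: dict) -> float:
    """Total sessions divided by the number of pages."""
    if not sessions_by_page:
        return 0.0
    return sum(sessions_by_page.values()) / len(sessions_by_page)


def share_of_pages_visited(sessions_by_page: dict) -> float:
    """Fraction of pages that received at least one session."""
    if not sessions_by_page:
        return 0.0
    visited = sum(1 for count in sessions_by_page.values() if count > 0)
    return visited / len(sessions_by_page)


# Hypothetical ten-page site with two sessions landing on two different pages.
spread_out = {f"/page-{i}": 0 for i in range(10)}
spread_out["/page-0"] = 1
spread_out["/page-1"] = 1

# Same site and same two sessions, but both land on a single page.
concentrated = {f"/page-{i}": 0 for i in range(10)}
concentrated["/page-0"] = 2

for name, data in [("spread_out", spread_out), ("concentrated", concentrated)]:
    print(name, average_sessions_per_page(data), share_of_pages_visited(data))
```

Both sites average 0.2 sessions per page, yet only the first has 20% of its pages visited; in the second it is 10%. Tracking both numbers gives a fuller picture than either alone.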

Jordan:                         Yeah, and every site kind of varies on these things. I think this is an interesting point, because content consolidation, or as I often call it, content deprecation …

Ben:                             Pruning.

Jordan:                         Yes, of course. It’s often an exercise that many of the large websites, you know, the big websites are typically dealing with. But you’d be surprised, there are bloggers who often encounter this problem, because what they end up doing or what ends up happening on a really small blog is that a lot of the content is very similar, very overlapped. The attention that Google can give to a particular topic or a particular theme is limited. So you can be remarkably surprised by how much of your content or how many of your pages might just not be generating any attention or awareness, not only just from search, but even from your users or your visitors.

Ben:                             Okay, so you’re going to look at your sessions per page to have an understanding of what sections of your website are actually getting activity. That helps you determine when there is inactivity, you should consider deprecating or pruning those pages. Is there any thought to moving pages or relisting or changing pages? Or when you’re going through this process and you realize that you have pages that are not performing, do you just hack them off and kill them?

Jordan:                         So the reality is solving for deprecation and pruning of content can be a challenging process. There are fundamentally two steps to discovering what is actually driving this issue. One of them is typically a traditional method, which is more of understanding the source. Is there something that’s particularly driving this issue? A great example of that is historically there have been things like tagging or setting a methodology in your blog or in your content to show content based on, say, dates or some sort of categorization. What all of a sudden happens is that you’re showing all these pages that essentially have the exact same content, because there isn’t much differentiation between tags, or there isn’t much differentiation between the content for particular dates or periods of time. That’s one source, one process by which you can find this problem, which is kind of diving into the source and seeing if that source is the driver.

Ben:                             Yeah. So essentially there’s some common threads in pages that can be causing the issue. You’re posting a bunch of content on the same day, and so Google says, “Well, these posts are all old. We’re not going to give them a lot of attention,” or you have the same tagging structure and Google prioritizes one of the pages, not all of them, because they look duplicative.

Jordan:                         Right, exactly, exactly. So that’s one of the methodologies to kind of understand, “Hey, what’s going on on my website, and why is this happening?” Another one that’s a little less clear at times for some brands is essentially this process by which a website starts to generate a tremendous amount of repetitive content. Google has a lot of different terminologies for this, and the search industry has a lot of different terminologies for this, but essentially what it comes down to is when you look at two different scenarios, are they the same thing or are they different?

Jordan:                         I’ll give you an example. One of the most common ones is in the e-commerce space, which is called refinements. Refinements are how do you define what is a particular product? You can refine something based on price, you can refine something based on color, you can refine something based on size. This is a common issue for many e-commerce players. I’m sure that the folks that are in the e-commerce space will sympathize with this, which is all of a sudden your refinement can just totally blow up the number of pages that look exactly the same, because what you’re showing is the exact same set of shoes within, say, a price category, or you’re showing the exact same set of shoes based on a color parameter. This creates a lot of really thin and useless content that Google … genuinely speaking, I think users find annoying and useless. So that’s one example.

Jordan:                         This exact same issue happens across a lot of different industries, not just e-commerce. It can happen in news and media, where you show topic-based similarities. It can happen in community sites, like a Yelp or other community or UGC websites, where you have a lot of ambiguous types of content that can be categorized in very similar ways. This is a very common and very frequent challenge that many websites face and have to deal with.
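One way to see how far refinements have multiplied a page is to strip the refinement parameters from each URL and group what remains. This is only an illustrative sketch: the parameter names (`price`, `color`, `size`) follow Jordan's examples but are assumptions about how a given site encodes them, and the `shop.example` URLs are invented.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of refinement parameters; a real site may use different ones.
REFINEMENT_PARAMS = {"price", "color", "size"}


def strip_refinements(url: str) -> str:
    """Return the URL with refinement query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in REFINEMENT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_by_base(urls: list) -> dict:
    """Map each refinement-free URL to the crawled URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_refinements(url)].append(url)
    return dict(groups)


# Hypothetical crawl output.
urls = [
    "https://shop.example/shoes",
    "https://shop.example/shoes?color=red",
    "https://shop.example/shoes?price=50-100",
    "https://shop.example/shoes?color=red&size=9",
    "https://shop.example/boots?page=2",
]

groups = group_by_base(urls)
for base, variants in groups.items():
    print(base, len(variants))
```

Here four crawled URLs collapse onto a single `/shoes` page, which makes them candidates for the consolidation or blocking options discussed in the episode. Whether a given refinement deserves its own indexable page is still a judgment call.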

Ben:                             Okay, so there’s essentially content that Google looks at and says is duplicative in e-commerce, because there’s only small variations to the page, which makes sense for why that doesn’t rank very well, because Google is saying it already exists multiple different places. What do you do with those types of pages?

Jordan:                         Yes. So how you go about treating these symptoms is a pretty funny and challenging topic, because what ends up happening is that there’s a multitude of rules that you can use to control these things. For everyone listening to this episode, the challenging part is, “Well, how do I fix it?” The most common way to fix it is just remove the page, like kill it, destroy it, remove it from your website. It sounds like a pretty common sense thing to do, but the reality is that that’s not always the thing to do. It’s not always an approach that you can take at scale. I think that’s one of the things that is really hard to understand and comprehend.

Jordan:                         That’s why there’s a multitude of different ways that you can go about fixing it. It includes redirects, it includes consolidation exercises, and it can include leveraging different tools like blocking, so you can block content from Google in particular, using things like robots.txt. You can also use measures like the meta directive, which is on a particular page: you can use a directive that states, “Google, do not crawl and do not index this particular page.” So there’s a collection of tools that exist.
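The two blocking tools mentioned here work at different levels: robots.txt rules control whether a crawler may fetch a URL, while a robots meta tag with `noindex` sits on a page that is still fetched and asks that it be kept out of the index. A crawler has to be able to fetch a page to see its meta tag, so the two are generally not combined on the same URL. The sketch below uses Python's standard `urllib.robotparser` to test a rule locally; the `/shoes/filter/` path and `shop.example` domain are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks crawling of refinement pages.
robots_txt = """\
User-agent: *
Disallow: /shoes/filter/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# can_fetch returns True when the user agent is allowed to crawl the URL.
refinement_allowed = parser.can_fetch(
    "Googlebot", "https://shop.example/shoes/filter/color-red"
)
category_allowed = parser.can_fetch("Googlebot", "https://shop.example/shoes")

print("refinement page crawlable:", refinement_allowed)
print("category page crawlable:", category_allowed)

# The page-level alternative: a robots meta tag placed in the page's <head>.
NOINDEX_TAG = '<meta name="robots" content="noindex">'
print(NOINDEX_TAG)
```

The standard-library parser does simple prefix matching and does not implement every extension that search engines support, so it is best treated as a quick sanity check rather than a statement of how Google will interpret a file.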

Jordan:                         The hard part … and this is the challenge about the whole thing, about pruning, is that it’s not that easy, and the solutions aren’t always step one, step two. There’s multiple paths that you can take.

Ben:                             I think the thing that concerns me going through the pruning process is you’re deleting pages, you’re taking them out of the rankings, and if you’re not precise, and if you take out the wrong pages, you can negatively impact your performance. I think that’s one of the reasons why people avoid pruning. But that said, talk to us a little bit about how often you should go through this process. What are the results you can expect if you do it the right way?

Jordan:                         So there’s a couple of ways to look at this particular topic. So the first thing is how often do you go through this process? That really depends on how frequently you’re changing content on your website. If you are changing content on your website at a large scale, say on a weekly or monthly basis, this is something that you want to look at, at minimum, on a quarterly basis. This goes to the big websites. This goes to the multimillion, in some cases up to the hundreds of millions of pages, in Google’s index. So we’re really talking about the top 500 websites that are on the internet here when you’re talking about that kind of scale.

Jordan:                         Now, there’s also the factor that when you’re not at that scale, what is it that you should be looking at? That’s another set of questions that you should be asking yourself, which is like, “Are there changes that are going on on our website that could create a sudden dynamic set of changes? Are you going through a migration? Are you doing any changes in navigation, site structure?” So whenever there’s a taxonomy or a structural change to your site that could all of a sudden increase the number of pages or the number of access points to content, you always want to be doing this exercise. You always want to be evaluating how many pages Google is indexing and looking at.

Jordan:                         So there’s two pieces to this. The first one, and my first example, is hey, how often, how frequently are you publishing or adding content to your site? That’s the first set of criteria. The second set of criteria is are there certain technical or transitional changes that are happening on the website that might require me to evaluate this? On the first one, it’s more of like a cadence. You probably, for your site, you might want to be doing something on a quarterly, annual basis. On the latter one, it’s more determined based on, say, your product road map or some of the changes that might be happening on your site.

Ben:                             So essentially, there’s two different methodologies here. One is setting a cadence to review your content to make sure that you’re removing anything that isn’t performing or is duplicative from your site. You could do that based on how much you’re posting content. So if you’re posting a high volume of content, do it quarterly. If you’re not posting a ton of content, you could probably get away with the pruning process once a year. Then there’s when you have big events, where you’re transitioning parts of your site, you’re changing the underlying taxonomy. When you have a lot of moving pieces, you also want to be going back through and making sure that everything that Google is crawling is something that you think is going to have value.

Ben:                             Okay, so give us some last words of advice going through the pruning process. What else should SEOs know?

Jordan:                         So fundamentally, this is a really challenging process for most SEOs, editorial teams, content owners, webmasters, because it’s really hard to let go. I mean, oftentimes these are assets or structural elements on a website that have existed for a long time. There’s either an owner of them or someone had the idea behind this. So it’s just hard to let go. People don’t want to let go of things that already exist. It’s human nature, to some degree. So I think it’s really important, it’s really prudent, for folks to think about that and understand that. That’s the challenge, that’s the hurdle oftentimes, because sure the technical implementation is challenging, sure the process of discovering this can also be challenging, but at the end of the day, there’s a lot of connection and sentiment to much of this content. That can be, oftentimes, the hardest part in terms of getting people to let go.

Jordan:                         Now, the reality is that one of the greatest things, in my opinion … I’ve had this experience both at Searchmetrics and outside of Searchmetrics … is sometimes you just need to have someone else from the outside come in and help take a look at these situations. That’s something that we’ve done often at Searchmetrics, which is come in and objectively say, “Hey look, there are certain parts of a website you don’t need to have. You should get rid of these, because it’s going to help your business overall.” So if it comes to that point, it’s often very useful to have that second set of eyes to help you make that decision.

Ben:                             Just like landscaping, sometimes it’s useful to pull that old, dead tree stump out of the ground to let something else grow there. With that, I think we’ll use a gardening metaphor to wrap up this episode of the Voices of Search podcast.

Ben:                             Thanks for listening to my conversation with Jordan Koene, the CEO of Searchmetrics. We’d love to continue the conversation and review, so if you’re interested in contacting Jordan, you can find links to his bio in our show notes, or you can shoot him an SEO related tweet to @jtkoene. That’s J-T-K-O-E-N-E on Twitter. If you have any general marketing questions, or if you want to talk about podcasting, you can find my contact information in our show notes, or you can send me a tweet @benjshap. That’s B-E-N-J-S-H-A-P.

Ben:                             If you’re interested in learning more about how to use search data to boost your organic traffic, online visibility, or to gain competitive insights, head over to searchmetrics.com for a free tour of our platform. If you like this podcast, and you want a regular stream of SEO and content marketing insights in your feed, hit the subscribe button in your podcast app. Lastly, if you’ve enjoyed this show and you’re feeling generous, we would be honored for you to leave a review in the Apple iTunes store. It’s a great way for us to share our learnings about SEO and content marketing.

Ben:                             Okay, that’s it for today, but until next time, remember, the answers you’re looking for are always in the data.