By Sean Work
Today’s article is a guest post by David Cheng of ShareBloc.
Maintaining a steady blog publishing routine is hard. At ShareBloc, we try to blog at least twice a week, either on our own blog or syndicated elsewhere. We’re a small team, so an aggressive schedule can be overwhelming.
To get enough content to keep up that pace, we used data to figure out which topics would be most interesting to our readers. Here are the three data-driven steps we took that you may also want to consider.
Some of the most prolific blogs publish two or three posts a day. We realized that, like any great artist, we could “borrow” from these blogs. We scraped 300 blog titles from some of the leading blogs in our space to see which keywords or phrases were most popular.
The blogs we chose were KISSmetrics, Marketo, HubSpot, Social Media Today, Content Marketing Institute, and the leading posts from ShareBloc’s sales and marketing bloc. We used the popular scraping tool ScrapingHub to collect our titles.
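Before getting into the ScrapingHub setup, here’s roughly what the counting step looks like once the titles are in hand. This is a minimal sketch that assumes the crawl (set up below) has exported the titles to a titles.json file; the filename and the stop-word list are placeholders, not what we actually used:

```python
# Count the most common words and two-word phrases across scraped titles.
# Assumes a titles.json file shaped like [{"title": "..."}, ...]; the
# stop-word list is a placeholder you'd want to expand.
import json
from collections import Counter

STOP_WORDS = {"the", "a", "an", "to", "of", "for", "and", "in", "on", "your", "how"}

with open("titles.json") as f:
    titles = [row["title"] for row in json.load(f)]

words, phrases = Counter(), Counter()
for title in titles:
    tokens = [t.strip(".,:?!\"'") for t in title.lower().split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    words.update(tokens)
    phrases.update(zip(tokens, tokens[1:]))

print(words.most_common(20))    # most popular keywords
print(phrases.most_common(20))  # most popular two-word phrases
```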
To set up your ScrapingHub spider, select the website you’d like the spider to start scraping and enter the patterns you want it to follow, along with any other special parameters. In this case, we’re asking the spider to start with the most recent posts and include page two, three, and so on.
We want the spider to go only one layer deep to get the post title, so in the spider settings we’ve set the depth limit to 1 to prevent the spider from entering a loop.
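Here’s roughly what that setup translates to in code. This is a sketch of an equivalent Scrapy spider, not our exact ScrapingHub configuration — the start URLs and the link selector are placeholders you’d swap in for the blog you’re targeting:

```python
# Sketch of a Scrapy spider equivalent to the ScrapingHub setup described
# above. Start URLs and the link selector are placeholders.
import scrapy


class BlogTitleSpider(scrapy.Spider):
    name = "blog_titles"

    # Start with the most recent posts, plus page two, three, etc.
    start_urls = [
        "https://example-blog.com/blog/",
        "https://example-blog.com/blog/page/2/",
        "https://example-blog.com/blog/page/3/",
    ]

    # Go only one layer deep so the spider can't wander into a loop.
    custom_settings = {"DEPTH_LIMIT": 1}

    def parse(self, response):
        # From each listing page, follow the links to individual posts.
        for href in response.css("a.entry-title::attr(href)").getall():
            yield response.follow(href, callback=self.parse_post)

    def parse_post(self, response):
        # Grab just the post title for now (the item setup comes next).
        yield {"title": response.css("title::text").get(default="").strip()}
```

Running it with `scrapy runspider blog_titles.py -o titles.json` dumps every title it finds into one file.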
Next, you have to set up the item you want scraped. In this case, we want to scrape only the title.
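In Scrapy terms, that single-field item would look something like the sketch below — the CSS selectors are guesses that depend on the target blog’s markup:

```python
# Sketch of a one-field item and its extraction. Selectors are placeholders
# that depend on the target blog's markup.
import scrapy


class PostTitleItem(scrapy.Item):
    # The only field we care about.
    title = scrapy.Field()


def extract_title(response):
    item = PostTitleItem()
    # Prefer the on-page <h1>; fall back to the <title> tag.
    item["title"] = (
        response.css("h1::text").get()
        or response.css("title::text").get(default="")
    ).strip()
    return item
```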
After you set up your parameters, you can review the pages that have been spidered in …