I was working on a WordPress SEO theme update and, as always, looking for anything to improve overall site performance (site speed is a Google ranking metric, so squeezing a few extra milliseconds out of a feature is worth the effort). I was struggling to understand the logic behind the WordPress Embed (oEmbed) features, especially since WordPress 4.4.

The oEmbed features can generate a lot of what appear to be unnecessary database entries and a lot of external HTTP get requests (though only when the oEmbed cache is created for the first time): both are generally not conducive to improved WordPress SEO performance.

Large database tables aren’t a performance issue per se, but if WordPress is adding unnecessary items to the database, WordPress is probably using the database items in some way and this will have a negative performance impact. If a WordPress feature is adding data to the PostMeta table and it’s loaded every time the relevant Post is called (via $post->ID) that’s not ideal.

For example my General Election WordPress site has almost 3,000 entries in the PostMeta table, not a particularly large database table, but over 1,000 of the rows are related to the WordPress oEmbed features and 681 of those have a meta_value of {{unknown}}.

WordPress oEmbed PostMeta Unknown Entries

Those 681 oEmbed related entries will be accessed by WordPress when Posts are loaded, for example there are 24 of the {{unknown}} entries for the post with Post ID 133.

Post 133 is BNP Immigration Policy which has 333 comments (as it happens all 24 entries are due to a Stallion Responsive theme feature that allows YouTube embeds in comments). I believe WordPress accesses those 24 database entries every time that post is loaded??? If so that’s a waste of server resources.

Time to investigate.

What are WordPress oEmbeds?

WordPress includes the ability to embed content from 3rd party websites like YouTube, Twitter, WordPress.org etc… using the oEmbed format.

The YouTube video below is an example of embedded content.

Wikipedia oEmbed description: oEmbed is an open format designed to allow embedding content from a website into another page.

An oEmbed exchange occurs between a consumer and a provider. A consumer wishes to show an embedded representation of a third-party resource on their own website, such as a photo or an embedded video. A provider implements the oEmbed API to allow consumers to fetch that representation.

In English :-) that means a provider (YouTube for example) supplies content in a format a consumer (a WordPress blog like the general election site for example) can easily convert to embedded content (in an iFrame for example) on the consumer’s site without having to understand much about embedding content: paste a plain text URL on a new line and the WordPress oEmbed functions do the rest.

WordPress has included oEmbed features since version 2.9, but in WordPress version 4.4 there was an increase in features and all core oEmbed options (options to turn the feature on/off) were removed: to turn oEmbed off in WordPress 4.4+ requires a plugin.

I’ve used WordPress’s built-in embed features for years, they’re awesome for quickly embedding YouTube videos for example. Stallion Responsive includes a feature to extend YouTube embedding to comments (not a core WordPress embed feature), allowing commenters to include YouTube videos simply by adding the URL of the YouTube video (the same way it works in a Post). To include a YouTube video in a Post (or a comment if using Stallion Responsive) simply include a YouTube URL on a separate line. For example the YouTube video below was embedded by simply pasting this URL https://www.youtube.com/watch?v=mgVw9TmqSL0 on a separate line below.

WordPress automatically creates the relevant iFrame code to embed the YouTube video with width 500px and height 375px : the dimensions can be set by the WordPress theme or a plugin or you can add embed shortcode and include the dimensions. Stallion Responsive includes CSS code so the YouTube iFrame is also mobile responsive, by default the YouTube iFrame isn’t mobile responsive.

WordPress oEmbed Content Stored in the PostMeta Table

The iFrame code is stored in the WordPress site’s database in the wp_postmeta table**

** In your WordPress site’s wp-config.php file is an option called Table Prefix which by default is set to “wp_”:

$table_prefix  = 'wp_';

If this is set to something other than “wp_”, for example “uniq_”, the PostMeta table name will reflect this: uniq_postmeta.

What you’ll find in your WordPress PostMeta table are entries like the ones in the screenshot below:

WordPress oEmbed Cache PostMeta Unknown Entries

The two ticked entries are related to a YouTube video embedded in a post.

If you aren’t familiar with database tables, the columns are labelled:
“meta_id” – the entry number: 1, 2, 3…. 9999….
“post_id” – the ID of the Post this entry is linked to.
“meta_key” – identifies what the entry is related to (oEmbeds in this case).
“meta_value” – the content of the entry, what WordPress might output on the page.

meta_id 18378 is related to the WordPress Post with post_id 2661 and is called _oembed_time_0c49bb5cc5d9c944befa406fe9f22f41 and the content is 1476117920

The above is a WordPress oEmbed cache timestamp (the number 1476117920 relates to a specific date). Reasonably confident the timestamp is used to rebuild the cache for the relevant valid embed code: under certain scenarios the URL (YouTube URL for example) is rechecked to see if the iFrame code has changed. Not sure when this happens, used to be on Updating a Post, but think it’s changed recently to be smarter: if you updated a post say 20+ times in a short period of time it could result in some providers (Twitter for example) blocking your site temporarily, pretty sure that’s no longer possible.
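To see what date a cache timestamp relates to, the number can be converted with plain PHP. This is only a minimal sketch: the helper oembed_cache_age_days() is a hypothetical name made up for illustration, it is not a WordPress function.

```php
<?php
// The meta_value of the _oembed_time_... row shown above (a Unix timestamp).
$cache_time = 1476117920;

// Convert it to a readable date in UTC.
echo gmdate( 'Y-m-d H:i:s', $cache_time ) . " UTC\n"; // prints "2016-10-10 16:45:20 UTC"

// Hypothetical helper: how many whole days old is a cached entry?
function oembed_cache_age_days( $cache_time, $now ) {
    return (int) floor( ( $now - $cache_time ) / 86400 );
}

echo oembed_cache_age_days( $cache_time, time() ) . " days old\n";
```

The age in days is the sort of value a recheck rule could be based on, though exactly when WordPress rechecks is not something this sketch tries to answer.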

meta_id 18377 is related to WordPress Post with post_id 2661 and is called _oembed_0c49bb5cc5d9c944befa406fe9f22f41 and the content is <iframe width=”500″ height=…

This is the iFrame HTML code (it’s truncated, you can see it all in the screenshot above as a hoverover) to embed a YouTube video. The iFrame code replaces the plain text YouTube URL pasted on a separate line.

Note how they both include “0c49bb5cc5d9c944befa406fe9f22f41” as part of the meta_key, that’s because the jumble of characters is a 32 character MD5 hash (an MD5 hash is 128 bits, written as 32 hexadecimal characters) of the URL etc…

“_oembed_time_32-character-md5-hash-of-url-and-args” and “_oembed_32-character-md5-hash-of-url-and-args”
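A rough sketch of how the two keys could be built. As far as I can tell WordPress hashes the URL together with the serialized embed attributes, so treat the hash input as an assumption. The function name and the width/height values below are also assumptions for illustration. The keys printed will only match a real database row if the inputs are exactly what WordPress used.

```php
<?php
// Sketch: build the pair of PostMeta keys for one embedded URL.
// ASSUMPTION: the hash input is the URL followed by the serialized attributes.
function example_oembed_cache_keys( $url, array $attr ) {
    $suffix = md5( $url . serialize( $attr ) ); // 32 hexadecimal characters
    return array(
        'html' => '_oembed_' . $suffix,      // row holding the embed HTML
        'time' => '_oembed_time_' . $suffix, // row holding the cache timestamp
    );
}

$keys = example_oembed_cache_keys(
    'https://www.youtube.com/watch?v=mgVw9TmqSL0',
    array( 'width' => 500, 'height' => 750 ) // assumed embed dimensions
);

echo $keys['html'] . "\n";
echo $keys['time'] . "\n";
```

Because the hash only depends on the URL and attributes, the same URL added to two posts gives the same meta_key on both, which fits the duplicate entries described below.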

A specific URL will always have the same _oembed_ entries no matter what post/site they are added to. If you add the same YouTube video to multiple posts for example it will generate multiple identical entries (other than the meta_id and post_id) in the PostMeta table, this is normal behavior. If you find duplicate entries, but the same URLs are NOT posted in the multiple posts, it’s probably a plugin or theme messing up (look for “global $post;” and “$post->ID” in the plugin/theme’s code, it might be the cause).

WordPress oEmbed Cache {{unknown}} PostMeta Entries

I have no issue with how this works with YouTube, Twitter etc…, my issue is understanding the logic behind entries with meta_value listed as {{unknown}}. You can see from the screenshot above Post 2661 has 8 entries listed as {{unknown}}.

The Post with post_id 2661 is the Languagenut Review which includes a single embedded YouTube video (the one related to meta_id 18378 and 18377) and nothing else I want embedding.

When I started researching embed/oEmbed a few days ago I had no idea what the 8 {{unknown}} entries were related to. Research indicated WordPress themes and plugins could cause these {{unknown}} oEmbed entries (basically bugs/mistakes), if a theme/plugin uses:

global $post;

It might cause the WordPress oEmbed functions to trip at times other than when a post’s content is loaded. By default WordPress oEmbed functions are called when the WordPress functions the_content() (a post’s content) or the_excerpt() (a post’s excerpt) are called: I think it’s “$post->ID” that trips the oEmbed feature, since calling content/excerpts in other ways also trips the oEmbed functions (get_the_excerpt() for example). With this in mind oEmbed is going to be called on pretty much every archive and WordPress Post that’s loaded where the embedded content hasn’t already been cached, and by any feature that accesses a post’s excerpt or content.

WordPress oEmbed Code Bugs?

Some themes/plugins set $post; as a global when they want access to the Posts post_id, for example Stallion Responsive 8.4 uses this a lot to get the post_id for generating the different Stallion Related Keywords anchor text etc… in the Stallion SEO Posts Widget.

If I understand things correctly, the issue with using “global $post;” in themes/plugins is under some scenarios this will trip the oEmbed functions for the post_id’s accessed that way: so a widget that accesses “$post->ID” might result in oEmbed PostMeta entries for the wrong post. For example if Post 1 has a YouTube video embedded, but Post 2 doesn’t, if Post 2’s “$post->ID” is loaded by the widget incorrectly while on Post 1, the YouTube video will be considered embedded on Post 1 and Post 2 (double the oEmbed entries in the database).

I’ve not replicated the above, but that’s how I’ve interpreted what I’ve read from multiple bug like issues.

This is what I thought was happening with all the {{unknown}} oEmbed entries (a Stallion Responsive 8.4 theme code bug), so set out to fix it. As you’ll read below, it’s not a code bug generating the {{unknown}} oEmbed entries, what I’d been reading about were bugs causing additional oEmbed entries and some of them were {{unknown}} entries.

Avoid Using $post->ID if get_the_ID() Works

In hindsight I’d been using the wrong code (it worked, but it wasn’t efficient). I was using “global $post;” so I could get the post_id outside the WordPress loop via “$post->ID” in a theme feature and had used the same code inside the loop (in the Stallion SEO Posts widget for example) where it wasn’t needed (it works, but it’s not the best way).

In almost every case (in Stallion Responsive 8.4) “global $post;” could be deleted and the WordPress default function for getting the Post ID “get_the_ID()” could replace “$post->ID“.

After spending two days removing instances of “global $post;” and replacing “$post->ID” with “get_the_ID()” and a LOT of testing, I’d removed the vast majority. Stallion Responsive includes a LOT of features built from various plugins, widgets and code snippets: dozens of features equals a lot of code to check/fix/retest.

Since I was removing global $post; where it wasn’t needed, I also looked at and removed many more unnecessary globals (like global $wpdb; which wasn’t always needed). You have to remember some WordPress themes and plugins are built by enthusiasts and they don’t always use the best code in performance terms. If you use another WordPress developer’s code without checking its ‘quality’, you’ll use inefficient code. I’ve done this myself: you add additional code (global $post; for example) to get something to work quickly and leave it in the final code because it works.

After all that effort I was looking forward to only seeing ‘valid’ oEmbed entries in the PostMeta table, wasn’t happy to find pretty much the same output! Stallion Responsive 8.4 didn’t have an oEmbed bug, it’s WordPress doing what it’s supposed to do when it finds URLs on a separate line :-)

WordPress Embed Discovery Tries to Embed Everything!

I’d misunderstood how oEmbed was working in WordPress 4.4+ (I was developing in WordPress 4.6.1). I thought oEmbed was only tripped for URLs that matched a whitelist of providers (YouTube, Twitter, Facebook etc…). I wrongly believed WordPress would check a URL to see if it was on the whitelist and if it was, trip the oEmbed functions (get the URL etc… to see if there was relevant embed code); if it wasn’t on the whitelist, do nothing.

See the full whitelist list at the link below:

https://codex.wordpress.org/Embeds Update: WordPress moved the article from a sub-domain to the main site, so the embed code changed, the main WordPress domain is whitelisted so the new link is embedded.

Embeds

I was wrong about how oEmbed works: oEmbed (WordPress Embed Discovery) is tripped for every plain text URL pasted on its own line in a WordPress Post! If a new line starts with http:// or https:// (the feature ignores spaces before http/https) WordPress oEmbed runs a get request on the URL to check if it’s a valid provider. The only exception I’ve found to this is if the URL is within code tags.
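The “URL on its own line” rule can be pictured with a simple regular expression. This is a simplified stand-in written for this article, NOT the pattern WordPress itself uses, and the function name is hypothetical. It ignores code tags completely.

```php
<?php
// Simplified stand-in for the "plain text URL on its own line" check.
// ASSUMPTION: this is an illustration only, not WordPress's actual pattern.
function example_find_standalone_urls( $content ) {
    // With the m flag ^ and $ match at line boundaries; spaces/tabs around the URL are allowed.
    preg_match_all( '/^[ \t]*(https?:\/\/\S+)[ \t]*$/im', $content, $matches );
    return $matches[1];
}

$content = "Some intro text\n"
    . "https://www.youtube.com/watch?v=mgVw9TmqSL0\n"
    . "See https://codex.wordpress.org/Embeds for the list\n"
    . "   https://seo-gold.com/title-tag-seo/\n"
    . "## https://seo-gold.com/anchor-text/\n";

// Only the YouTube URL and the indented seo-gold.com URL are on their own line.
print_r( example_find_standalone_urls( $content ) );
```

The URL in the middle of a sentence and the URL with ## in front of it are skipped, the URL with leading spaces is not.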

Note: The above plain text link to the old WordPress Codex URL hasn’t resulted in any embed code, but it did result in a database entry in the PostMeta table with “meta_key” value “_oembed_ab524b46a7ab7515bf748ba8742ad55d” and “meta_value” value “{{unknown}}”. Looks like the WordPress Codex sub-domain doesn’t support embedding its content :-)

The new link is embedded, below is the embed HTML code produced after the code has been minified by the W3 Total Cache Plugin and Lazy Loaded by the Rocket Lazy Load Plugin.

<blockquote class=wp-embedded-content data-secret=hm3C9vZKK7><p><a href=https://wordpress.org/support/article/embeds/ >Embeds</a></p></blockquote><p><iframe loading=lazy title="&#8220;Embeds&#8221; &#8212; WordPress.org Forums" class=wp-embedded-content sandbox=allow-scripts security=restricted style="position: absolute; clip: rect(1px, 1px, 1px, 1px);" src=about:blank data-secret=hm3C9vZKK7 width=500 height=282 frameborder=0 marginwidth=0 marginheight=0 scrolling=no data-rocket-lazyload=fitvidscompatible data-lazy-src="https://wordpress.org/support/article/embeds/embed/#?secret=hm3C9vZKK7"></iframe><noscript><iframe title="&#8220;Embeds&#8221; &#8212; WordPress.org Forums" class=wp-embedded-content sandbox=allow-scripts security=restricted style="position: absolute; clip: rect(1px, 1px, 1px, 1px);" src="https://wordpress.org/support/article/embeds/embed/#?secret=hm3C9vZKK7" data-secret=hm3C9vZKK7 width=500 height=282 frameborder=0 marginwidth=0 marginheight=0 scrolling=no></iframe></noscript>

If I’ve understood this correctly this is what happens to a new plain text URL that’s not within code tags.

For a Valid oEmbed Provider : YouTube Video For Example

First there’s a head request
HEAD https://www.youtube.com/watch?v=oYtMmSfDCYc
Then a get request
GET https://www.youtube.com/watch?v=oYtMmSfDCYc
Then another get request
GET http://www.youtube.com/oembed?scheme=https&maxwidth=500&maxheight=750&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DoYtMmSfDCYc&format=json
Valid oEmbed iFrame code is added to the WordPress database.
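The final get request above is the post URL wrapped up as a query string for the provider’s oEmbed endpoint. A minimal sketch of rebuilding that request URL in plain PHP follows. The function name is hypothetical and the parameter list is simply copied from the request shown above.

```php
<?php
// Sketch: rebuild the provider request URL from the log above.
// The function name is made up for illustration.
function example_oembed_request_url( $endpoint, $url, $maxwidth, $maxheight ) {
    $query = http_build_query(
        array(
            'scheme'    => 'https',
            'maxwidth'  => $maxwidth,
            'maxheight' => $maxheight,
            'url'       => $url, // percent-encoded by http_build_query()
            'format'    => 'json',
        ),
        '',
        '&'
    );
    return $endpoint . '?' . $query;
}

echo example_oembed_request_url(
    'http://www.youtube.com/oembed',
    'https://www.youtube.com/watch?v=oYtMmSfDCYc',
    500,
    750
) . "\n";
```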

This is a one time occurrence: after the URL is checked the result is cached in the PostMeta table and on later page loads the cache is used instead (the cache is presumably checked first).

For a Site That’s Not a Provider : the old WordPress Codex Link For Example

First there’s a head request
HEAD https://codex.wordpress.org/Embeds
Then a get request
GET https://codex.wordpress.org/
An {{unknown}} oEmbed entry is added to the database.

This is a one time occurrence, after the URL is checked the result is cached in the PostMeta table.

After the oEmbed result is cached in the PostMeta table, every time this article’s content is accessed (via the the_content() or the_excerpt() functions for example), WordPress will load the relevant rows from the PostMeta table (the ones related to this post’s post_id) so WordPress oEmbed won’t recheck the URL (via a head and get request). I’m not a database expert, but this seems wasteful for telling WordPress to do nothing to some text based content (the URLs generating {{unknown}}) in a post???

And one disturbing feature: the {{unknown}} entries do not have an associated timestamp entry, so are these ever rechecked? If they are, when? I’ve been manually clearing them for testing.
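The three outcomes described above (nothing cached yet, a cached embed, a cached {{unknown}}) can be pictured with a small sketch. It uses a plain PHP array as a stand-in for one post’s PostMeta rows. The function name is hypothetical and the key is simplified to an MD5 of the URL alone, so this is an illustration of the idea rather than how WordPress is written.

```php
<?php
// Sketch: decide what to do with a plain text URL based on cached PostMeta rows.
// ASSUMPTIONS: $postmeta is an in-memory stand-in for the database rows and
// the key is simplified to md5( $url ) for readability.
function example_embed_from_cache( array $postmeta, $url ) {
    $key = '_oembed_' . md5( $url );

    if ( ! isset( $postmeta[ $key ] ) ) {
        return array( 'action' => 'fetch', 'output' => null ); // no row yet: external requests needed
    }
    if ( '{{unknown}}' === $postmeta[ $key ] ) {
        return array( 'action' => 'skip', 'output' => $url ); // cached failure: nothing to embed
    }
    return array( 'action' => 'embed', 'output' => $postmeta[ $key ] ); // cached embed HTML
}

$video = 'https://www.youtube.com/watch?v=oYtMmSfDCYc';
$codex = 'https://codex.wordpress.org/Embeds';

$postmeta = array(
    '_oembed_' . md5( $video ) => '<iframe width="500" height="375"></iframe>', // shortened stand-in HTML
    '_oembed_' . md5( $codex ) => '{{unknown}}',
);

foreach ( array( $video, $codex, 'https://seo-gold.com/title-tag-seo/' ) as $url ) {
    $result = example_embed_from_cache( $postmeta, $url );
    echo $url . ' => ' . $result['action'] . "\n";
}
```

Even in the “skip” case a row still has to be read to find out there is nothing to do, which is the part that seems wasteful.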

Recall the general election site earlier, that site has 681 {{unknown}} entries with one post having 24 {{unknown}} entries: those 24 entries are related to plain text URLs added to comments (recall the Stallion Responsive feature that embeds YouTube videos into comments). This is nothing compared to some websites, some have hundreds of thousands of these {{unknown}} entries, though they might be due to bugs or poorly written themes/plugins!

A valid provider appears to be ANY webpage that sends back the correct oEmbed data, so a plain text link to any WordPress blog will be embedded (by default WordPress blogs should send the correct oEmbed data back from a get request) and WordPress will add a link to the webpage etc…. For example adding a plain text link to https://seo-gold.com/title-tag-seo/ on its own line (note: spaces added before the URL are ignored, so there’s no easy HIDDEN way to block this default WordPress behavior) results in this output:

Title Tag SEO Tutorial

Another example with a plain text link to https://md-law.classic-literature.co.uk/30-day-wild-day-6-painted-lady-butterfly/ (one of my other WordPress sites).

https://md-law.classic-literature.co.uk/30-day-wild-day-6-painted-lady-butterfly/

It’s a really nice WordPress feature in theory, but if the plain text link is linking to a non-provider (or 404 error page) it generates one of the {{unknown}} entries in the PostMeta table. The problem is WordPress oEmbed checks almost anything plain text linked to and it shouldn’t: obvious exceptions are links added inside code and pre tags. Code tags are honored, but pre tags aren’t!

Based on testing, some URLs are processed that don’t make sense.
URLs ending .js and .css for example.
It would also make sense not to check URLs starting http://example.com/, https://example.com/, http://www.example.com/, https://www.example.com/ (and domain.com/ derivatives) since developers tend to use those for showing URLs that do not exist and no one will want those embedding.
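A pre-check along those lines might look something like the sketch below. This is a suggestion only, not something WordPress does: the function name and the list of placeholder domains are my own assumptions.

```php
<?php
// Hypothetical pre-check: is this a URL that is never worth an oEmbed lookup?
function example_skip_oembed_lookup( $url ) {
    // parse_url() returns null/false for missing parts, so cast to string.
    $host = strtolower( (string) parse_url( $url, PHP_URL_HOST ) );
    $path = strtolower( (string) parse_url( $url, PHP_URL_PATH ) );

    // Placeholder domains developers use for URLs that do not exist.
    $placeholder_hosts = array( 'example.com', 'www.example.com', 'domain.com', 'www.domain.com' );
    if ( in_array( $host, $placeholder_hosts, true ) ) {
        return true;
    }

    // Static assets such as scripts and stylesheets.
    return (bool) preg_match( '/\.(js|css)$/', $path );
}

var_dump( example_skip_oembed_lookup( 'https://www.example.com/some-page/' ) );  // bool(true)
var_dump( example_skip_oembed_lookup( 'https://seo-gold.com/style.css' ) );       // bool(true)
var_dump( example_skip_oembed_lookup( 'https://seo-gold.com/title-tag-seo/' ) );  // bool(false)
```

Checking the path rather than the whole URL means a stylesheet URL with a version query string on the end would still be caught.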

As a WordPress theme/plugin developer I tend to post a fair amount of plain text URLs I DO NOT want WordPress to embed (I want it treated as plain text or code: I place it in pre or code tags, IMO WordPress by default should not embed anything in those two tags). For example the Languagenut Review post has lots of plain text URLs in pre/code tags that are processed by oEmbed.

If I clear the oEmbed cache (delete the relevant entries from the database: if you don’t understand databases, the “Disable Embeds WordPress Plugin” can delete all the PostMeta oEmbed entries), upon reloading the post it results in 16 HTTP get requests (see screenshot).

WordPress oEmbed External HTTP Get Requests

Of the 16 get requests, 8 are to 404 error pages caused by oEmbed trying to get content from webpages that don’t exist: the plain text URLs are truncated links (copied from Google’s PageSpeed Insights Tool results for the Languagenut site) added between code/pre tags, which should never be processed as links etc…

Since I use a mix of pre and code tags I had to edit the post and make sure all plain text URLs on their own line were put inside code tags (can’t use pre).

We also need a way to easily disable this default behavior, where I can get away with it I’m adding ## before the URL like this:

## https://seo-gold.com/anchor-text/

What also works as long as the tags are on the same line as the plain text URL:

Strong tag stops single line oEmbed
https://seo-gold.com/why-seo-training-is-essential-for-marketing-professionals/

Em tag stops single line oEmbed
https://seo-gold.com/website-navigation-structure-seo-guide/

Ins tag stops single line oEmbed
https://seo-gold.com/seo-meta-tags/

Code tag stops single line oEmbed
https://seo-gold.com/nofollow-seo-tutorial/

The above tags are all available on the WordPress default Post Edit screen when Text is selected.

Other tags that work as long as the URL is on one line:

Pre tag stops single line oEmbed

https://seo-gold.com/fresh-content-seo/

Span tag stops single line oEmbed
https://seo-gold.com/seo-myth-seo-is-dead/

The problem with all these tags is they only work for a single line, I believe the only exception to this is code tags:

Code tag stops multi-line oEmbed
https://seo-gold.com/marcus-lemonis-website-seo-review/
https://seo-gold.com/camping-world-rv-sales-seo-review/
https://seo-gold.com/nyc-bagel-deli-website-seo-review/

Unfortunately with pre tags plain text URLs are processed by oEmbed:

Pre tag does NOT stop multi-line oEmbed

https://seo-gold.com/cyber-security-and-seo-link-spam/
https://seo-gold.com/seo-tutorial/
https://seo-gold.com/top-5-seo-tips-infographic/

What’s happened above is the first URL isn’t embedded because it’s not on its own line (it has an opening pre tag at the start: you can’t see it, it’s a tag).

The next line is a plain text URL https://seo-gold.com/seo-tutorial/ on a separate line, so it’s embedded.

The final line is considered on a separate line, but after the URL https://stallion-theme.co.uk/seo-tutorial-for-wordpress/ is a closing /pre tag and the closing tag is considered part of the URL (links to a 404 error page resulting in an oEmbed {{unknown}} entry).

As you can see above adding plain text URLs to pre tags results in a mess.

Update December 2019: need to check if WordPress changed this.

I plan to try to fix this behavior, so here’s a screenshot of what the above looked like at the time of creating this article.

WordPress oEmbed Inside Pre Tags

Considering these features are developed by WordPress developers who must post plain text URLs in code/pre tags all the time, I’m surprised oEmbed works inside pre tags.

Plain text URLs within pre tags should NOT be processed by oEmbed and there should be at least one ‘hidden’ way to easily add plain text URLs (like span tags) with no additional styling (like code, but without any CSS styling). Span tags work (unless your theme has styling for span tags), but span tags are not on the WordPress Post Edit screen and so not user friendly.

WordPress Embeds and Performance SEO

The oEmbed {{unknown}} entries will have a small negative impact on performance, best advice is avoid posting plain text links on a new line which you do not want embedding and do not want an {{unknown}} entry added to the database.
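For anyone happy to add code to a theme’s functions.php file or a small plugin, WordPress core has hooks that look relevant. The sketch below is untested on my sites and the function name is hypothetical. The embed_oembed_discover filter and the autoembed callback on the_content are WordPress core as far as I know. I have not confirmed whether turning discovery off also stops the {{unknown}} rows being written, it should at least stop the external requests to sites that are not on the whitelist.

```php
<?php
// Sketch for functions.php or a small plugin. The function name is made up.
function example_limit_oembed() {
    if ( ! function_exists( 'add_filter' ) ) {
        return false; // not running inside WordPress, nothing to do
    }

    // Only look up URLs from the built-in provider list, no discovery requests.
    add_filter( 'embed_oembed_discover', '__return_false' );

    // Optional, more drastic: stop plain text URLs in post content being auto-embedded at all.
    // remove_filter( 'the_content', array( $GLOBALS['wp_embed'], 'autoembed' ), 8 );

    return true;
}

example_limit_oembed();
```

The function_exists() check is only there so the file can be loaded outside WordPress without a fatal error.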

I plan to go through this website and modify all posts which have {{unknown}} entries and change the posts content (and comments content) so there are no {{unknown}} entries (other than this post as an example).

After making the changes I’ll be manually deleting the relevant rows in the PostMeta table so the oEmbed cache is rebuilt: I’m not certain, but I think the {{unknown}} entries never expire???

If you’re not comfortable with SQL commands, there’s the Oembed Cache WordPress plugin which has an option to delete the entire oEmbed cache at the click of a button.

Embeds in WordPress Comments

In the Stallion Responsive 8.5 code, the embed YouTube videos in comments feature is working as it is supposed to (embeds URLs from sites in the whitelist, YouTube, WordPress TV, Vimeo, Twitter etc…) other than when a URL has been posted in the Post AND a Comment.

When a URL that’s already been cached because it was added to the Post is also added to a Comment (a Comment associated with the Post), the Comment version of the plain text URL uses the Post oEmbed cache. It’s an edge case (how often will the same plain text URL be added to the Post and a Comment on that post?), but I’m not happy with this behavior, so I’m looking for a fix. In the meantime it’s a case of editing either the Post or the Comment so the plain text URL isn’t embedded on one or both of them.

I have improved oEmbed output for Stallion Responsive 8.5. In Stallion Responsive 8.4 there was a catchall YouTube CSS class (called styoutube) which added a div around the embedded iFrame which made the YouTube video responsive in mobile devices. I didn’t realise this class was also added around all embedded content which cut the bottom off of embedded Twitter content for example.

In 8.5 the styoutube class is currently only added around YouTube and Vimeo videos (will modify the function further to cover other video sources before I release Stallion Responsive 8.5).

The new Stallion Responsive 8.5 function so far is

// Wrap embedded video HTML in a div so the iFrame can be made responsive with CSS.
// $url, $attr and $post_id are unused; they default to empty values so the
// callback still works if a filter passes only the HTML.
function str_embed_html( $cache, $url = '', $attr = array(), $post_id = 0 ) {
  if ( preg_match( '/youtube\.com/', $cache ) ) { // YouTube video
    return '<div class="styoutube">' . $cache . '</div>';
  } elseif ( preg_match( '/vimeo\.com/', $cache ) ) { // Vimeo video
    return '<div class="styoutube">' . $cache . '</div>';
#  } elseif ( preg_match( '/twitter\.com/', $cache ) ) { // Twitter Tweet
#    return '<div class="align-center">' . $cache . '</div>';
  } else {
    return $cache; // everything else is returned unchanged
  }
}
add_filter('embed_oembed_html', 'str_embed_html', 99, 4); // WordPress default
add_filter('video_embed_html', 'str_embed_html', 99, 4); // Jetpack

and the relevant CSS code to make videos responsive:

.styoutube {
  position: relative;
  padding-bottom: 56.25%; /* 16:9 */
  padding-top: 25px;
  height: 0;
  margin: 5px 0 15px 0;
  overflow: hidden;
}
.styoutube iframe {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}

I commented out the Twitter part of the function, instead used a class (called twitter-tweet) that’s part of the Twitter iFrame.

.twitter-tweet, .wp-embedded-content {
  width: 100%;
  clear: both;
  display: block;
  margin-left: auto;
  margin-right: auto;
}

I shared this post on Twitter and the Twitter Status is at: https://twitter.com/DavidLaw/status/787474418188611584, see the WordPress oEmbed output below.

The class wp-embedded-content is added to links that are embedded.

There’s forty odd URLs WordPress embeds content from, going to have to go through them all to see if they need specific code (I hope not :-)).

WordPress 4.6.1 oEmbed Whitelist
Dailymotion | dailymotion.com
Flickr | flickr.com
Hulu | hulu.com
Photobucket | photobucket.com
Scribd | scribd.com
Vimeo | vimeo.com
WordPress.tv | wordpress.tv
YouTube | youtube.com/watch
Funny or Die | funnyordie.com
Polldaddy | polldaddy.com
SmugMug | smugmug.com
YouTube | youtu.be
Twitter | twitter.com
Instagram | instagram.com
Instagram | instagr.am
Slideshare | slideshare.net
SoundCloud | soundcloud.com
Dailymotion | dai.ly
Flickr | flic.kr
Spotify | spotify.com
Imgur | imgur.com
Meetup.com | meetup.com
Meetup.com | meetu.ps
Animoto | animoto.com
Animoto | video214.com
CollegeHumor | collegehumor.com
Issuu | issuu.com
Mixcloud | mixcloud.com
Polldaddy | poll.fm
TED | ted.com
YouTube | youtube.com/playlist
Vine | vine.co
Tumblr | tumblr.com
Kickstarter | kickstarter.com
Kickstarter | kck.st
Cloudup | cloudup.com
ReverbNation | reverbnation.com
VideoPress | videopress.com
Reddit | reddit.com
Speaker Deck | speakerdeck.com
Twitter | twitter.com/timelines
Twitter | twitter.com/moments

David Law

David Law : Technical SEO Expert with 20+ years Online Business, SEO, Search Engine Marketing and Social Media Marketing experience... Creator of multiple WordPress SEO Themes and SEO Plugins. Interests: wildlife, walking, environmental issues, politics, economics, journalism, consumer rights.

Website - SEO Gold Services