How to force nofollow rel on external post content URLs?

Discussion in 'General Troubleshooting' started by seekyt, Feb 4, 2012.

  1. seekyt

    seekyt Donor Donor

    This may not appeal to all Hotaru users if you use "Dofollow" as a way to increase membership on your website. Also, I realize I might be asking for a complicated solution here.

    I would like to dynamically insert a rel="nofollow" attribute on all external links inside a post. This is easy to do for the title link, but I'm interested in doing it for any outbound URL inside $h->post->content.

    I want to do this for SEO reasons, so a JavaScript snippet will not be suitable (although it's easy to do it that way). I have found three possible solutions from searching the internet: a regex method, an XML parser method, and a DOM parser method.

    This is the regex:

    PHP:
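    <?php
    // Single-quoted pattern so the double quotes inside it need no escaping.
    $html = preg_replace_callback('#<a\s[^>]*href="(http://[^"]+)"[^>]*>#', 'cb_ext_url', $html);

    function cb_ext_url($match) {
        list($orig, $url) = $match;
        if (strstr($url, "http://localhost/")) {
            // Link to this site: leave untouched.
            return $orig;
        }
        elseif (strstr($orig, "rel=")) {
            // Tag already has a rel attribute: leave untouched.
            return $orig;
        }
        else {
            // Drop the closing ">" and re-add it after the new attribute.
            return rtrim($orig, ">") . ' rel="nofollow">';
        }
    }

    ?>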
    This is the XML method:
    PHP:
    <?php

    $xmlString = "This is where the HTML of your site should go. Make sure it's valid!";

    $xml = new SimpleXMLElement($xmlString);

    // SimpleXMLElement has no getElementsByTagName(); xpath() finds every <a>.
    foreach ($xml->xpath('//a') as $a)
    {
      $attributes = $a->attributes();

      // isThisExternal() is not defined here; you need to supply it yourself.
      if (isThisExternal((string) $attributes['href']))
      {
        $a['rel'] = 'nofollow';
      }
    }

    echo $xml->asXML();

    ?>
    And this is the DOM method I have crudely put together:

    Code:
    $html = str_get_html('$h->url');
    
    $html->find('a',1)->rel = 'nofollow';
    
    echo $html;
    
    I've tried the XML and regex methods but I run into problems which result in the post content not loading, and the error log showing "unexpected character" errors. I assume that this is user error and not code error since these solutions have worked for others before (stackoverflow research).

    I'm pretty sure that the dom method is not going to work because I don't know much about how to use it.

    Any ideas?
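For reference, here is a minimal sketch of the DOM route using PHP's built-in DOMDocument instead of the third-party str_get_html() call above. The function name add_nofollow_dom and the $siteHost parameter are made up for illustration, and the sketch assumes the post content is an HTML fragment; treat it as a starting point rather than a tested drop-in.

```php
<?php
// Hypothetical helper (not from this thread): add rel="nofollow" to external links.
// $siteHost is an assumed parameter holding your own domain, e.g. "example.com".
function add_nofollow_dom($html, $siteHost) {
    $doc = new DOMDocument();
    // Wrap the fragment in a div so it can be located again after parsing;
    // the XML declaration is a common trick to make loadHTML() read UTF-8.
    $wrapped = '<?xml encoding="UTF-8"><div>' . $html . '</div>';
    libxml_use_internal_errors(true);   // tolerate imperfect user-submitted HTML
    $doc->loadHTML($wrapped);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('a') as $a) {
        $host = parse_url($a->getAttribute('href'), PHP_URL_HOST);
        // Relative links have no host; links to our own host are internal.
        if (!$host || strcasecmp($host, $siteHost) === 0) {
            continue;
        }
        // Leave links alone if they already carry a rel attribute.
        if (!$a->hasAttribute('rel')) {
            $a->setAttribute('rel', 'nofollow');
        }
    }

    // Serialise only the children of the wrapper div (the first div in the document).
    $root = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

In a template this would be called as something like echo add_nofollow_dom($h->post->content, 'example.com'); with your own domain in place of example.com.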
     
    Last edited: Feb 4, 2012
  2. PuckRobin

    PuckRobin New Member

    The easiest way is to add it to bookmarking templates. For example in bookmarking_post.php replace:
    PHP:
    echo nl2br($h->post->content);
    with:

    PHP:
    echo preg_replace_callback('#<a\s[^>]*href="(http://[^"]+)"[^>]*>#', 'cb_ext_url', nl2br($h->post->content));
    Of course you need to add your cb_ext_url function to funcs.strings.php or to index.php of the theme.

    A harder way is to write a plugin that uses the post_read_post hook.
     
  3. seekyt

    seekyt Donor Donor

    Thank you, PuckRobin!

    For anyone who is interested, the next two posts reveal the whole solution:

    I have had to change the original regex a little bit because the other one was poorly written and didn't work.
    Edited line in bookmarking_post.php:

    PHP:
    <?php echo preg_replace_callback('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', 'cb_ext_url', $h->post->content); ?>
    (I do not need nl2br, but you might)


    See next post for what goes in your theme file:
     
    Last edited: Feb 5, 2012
  4. seekyt

    seekyt Donor Donor

    SOLVED

    Updated:

    The correct syntax in index.php:

    PHP:
    <?php

    function cb_ext_url($match) {
        list($orig, $url) = $match;
        if (strstr($url, "http://example.com/")) {
            return $orig;
        }
        elseif (strstr($orig, "rel=")) {
            return $orig;
        }
        else {
            return rtrim($orig, ">") . '"rel="nofollow';
        }
    }

    ?>
     
    Last edited: Feb 5, 2012
  5. PuckRobin

    PuckRobin New Member

    First of all this is wrong:
    PHP:
    return rtrim($orig, ">") . ' "rel=nofollow"';
    It should be rel="nofollow". And I didn't understand why rtrim is there and why regex is so complicated :confused:

    Just analyse this WP plugin ($site_link: BASEURL):
    http://wordpress.org/extend/plugins/external-nofollow/
     
  6. seekyt

    seekyt Donor Donor

    The output from:

    PHP:
    return rtrim($orig, ">") . '"rel="nofollow';
    Is

    HTML:
    rel="nofollow"
    I'm also confused as to why! I edited my previous post several times because I was getting a lot of errors with the original code.

    The only problem with this function is that now all URLs are converted to nofollow, not just external URLs. I'm not sure how to fix the function :p
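One plausible explanation for that output, assuming the URL-matching regex from the edited bookmarking_post.php line earlier in the thread is the one in use: the pattern matches only the URL text inside href="...", not the whole <a> tag. The callback's return value is therefore spliced in just before the attribute's existing closing quote, which ends up closing the rel value instead. A small sketch (the sample HTML and the name append_rel are made up for illustration):

```php
<?php
// Regex copied from the earlier post: it matches a bare URL, not an <a> tag.
$pattern = '#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#';

// Hypothetical callback that mirrors the return line under discussion.
function append_rel($match) {
    $orig = $match[0];   // the matched URL text only, so rtrim() has no ">" to strip
    return rtrim($orig, ">") . '"rel="nofollow';
}

$html = '<a href="http://other.org/page">link</a>';
$out = preg_replace_callback($pattern, 'append_rel', $html);

// The quote that originally closed href now closes rel:
// <a href="http://other.org/page"rel="nofollow">link</a>
echo $out, "\n";
```

Note that there is no space between the two attributes in the result. Browsers generally tolerate this, but a space before rel would be cleaner.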
     
  7. PuckRobin

    PuckRobin New Member

    if (!strstr($url, "http://")) return $orig; or if (stripos($url, "http://") === false) return $orig;

    will help.
     
  8. seekyt

    seekyt Donor Donor

    I was finally able to figure it out based on your previous comment. Here is what finally worked:

    PHP:
    <?php

    function cb_ext_url($match) {
        list($orig, $url) = $match;
        if (strstr($orig, "http://sitename.com/")) {
            return $orig;
        }
        // strpos() can return 0, so compare against false explicitly.
        elseif (strpos($orig, "rel=") !== false) {
            return $orig;
        }
        else {
            return rtrim($orig, ">") . '"rel="nofollow';
        }
    }

    ?>
    The typo was in the first if statement. I replaced $url with $orig and it worked beautifully.

    Thanks again!
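A possible refinement, offered as a sketch rather than a tested fix: the strstr() check above looks for the site address anywhere in the matched text, so it would miss https:// or www. variants of the same site, and it would treat an external URL that merely contains the site address (for example in a query string) as internal. Comparing hosts with parse_url() avoids both. The function name is_external_url and the $siteHost parameter are hypothetical.

```php
<?php
// Hypothetical helper: true when $url points at a host other than $siteHost.
function is_external_url($url, $siteHost) {
    $host = parse_url($url, PHP_URL_HOST);
    if (!$host) {
        return false;   // relative or unparsable URL: treat as internal
    }
    // Ignore case and a leading "www." on both sides before comparing.
    $normalise = function ($h) {
        return preg_replace('/^www\./', '', strtolower($h));
    };
    return $normalise($host) !== $normalise($siteHost);
}

var_dump(is_external_url('http://sitename.com/post/1', 'sitename.com'));     // bool(false)
var_dump(is_external_url('https://www.sitename.com/about', 'sitename.com')); // bool(false)
var_dump(is_external_url('http://other.org/?ref=http://sitename.com/', 'sitename.com')); // bool(true)
```

Inside cb_ext_url, the first condition could then become if (!is_external_url($orig, 'sitename.com')) with your own domain substituted.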
     
  9. mabujo

    mabujo Designer

    You might think you would want to do this for SEO reasons, but you are almost certainly wrong.
     
  10. PuckRobin

    PuckRobin New Member

    It makes sense in terms of SEO. If you have little or no original content and lots of external dofollow links, Google will regard the website as a "link farm".
     
  11. mabujo

    mabujo Designer

    And nofollow changes that how?
     
  12. PuckRobin

    PuckRobin New Member

    A nofollow link means "Do not account for this link". So if the purpose were to build a link farm, you would be using dofollow links. Google is smart enough to understand this.

    BTW why don't you explain why this is wrong?
     
  13. mabujo

    mabujo Designer

    ->
    Is unproven. My point is if you have shitty content and nofollowed links it ain't much different from having shitty content and dofollowed links. Shitty content is its own negative ranking factor, it's not the linking or absence of linking that changes anything. "Google is smart enough to understand this" too.

    It may make sense to have links nofollowed, but perhaps remove the nofollow on links that reach a certain number of votes (as then you will be linking out only to sites relevant to yours, which is a positive trust signal).
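The vote-threshold idea could look roughly like the sketch below. Everything here is hypothetical: the function names, the default threshold of 10, and the way the vote count is passed in. How you read a post's vote count depends on your Hotaru setup and is not shown.

```php
<?php
// Hypothetical helper: pick the rel value from a post's vote count.
function rel_for_votes($votes, $threshold = 10) {
    // Links on posts that reached the threshold are followed; the rest are not.
    return ($votes >= $threshold) ? '' : 'nofollow';
}

// Hypothetical helper: build an anchor tag whose rel depends on the votes.
function link_with_votes($url, $text, $votes, $threshold = 10) {
    $rel = rel_for_votes($votes, $threshold);
    $relAttr = ($rel === '') ? '' : ' rel="' . $rel . '"';
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '"' . $relAttr . '>'
         . htmlspecialchars($text, ENT_QUOTES) . '</a>';
}

echo link_with_votes('http://other.org/', 'low votes', 3), "\n";   // keeps rel="nofollow"
echo link_with_votes('http://other.org/', 'popular', 12), "\n";    // no rel attribute
```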
     
  14. PuckRobin

    PuckRobin New Member

    What you wrote above does not explain why you said using nofollow "is wrong" by itself. Maybe you meant "is not sufficient" or "useless if you have shitty content"? If that is what you meant, I completely agree.
     
  15. seekyt

    seekyt Donor Donor

    Forcing nofollow on links should deter some of the people who sign up at my site only to build backlinks with low-quality content, which as a whole will help SEO because less of this garbage will slip through. I moderate submissions, and spammers cause quite a headache, so this is a great deterrent for some of these people.

    I do not use Hotaru for social bookmarking, so having "shitty" content is not necessarily a problem for me. The problem comes from members who do not understand SEO and link out to other sites using anchor text they shouldn't be using, which causes them to lose authority for that particular keyword. I actually require a completely unique article of at least 2,000 characters on my Hotaru site, which is checked for plagiarism, spun content, quality, linking, etc.
     
  16. mabujo

    mabujo Designer

    If you use the link bar plugin, your site isn't really linking out at all, and it still doesn't stop people submitting links. The submitters generally don't check, are automated, or both.
     
