How to get links from hot linkers using Apache logs

Unless you have taken steps to prevent hot linking, people are probably linking directly to images on your site. This is a bit cheeky, as you are effectively hosting content for them. But instead of blocking their access, why not contact them and request a credit link?

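Most of you will be running Linux servers, so here is a quick one-liner that parses the Apache access log and gives you a list of the pages hot linking your images (the hot linking domain is at the start of each URL). Swap www.lionseo.com for your own domain.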

You will need root-level shell access, or you could ask your admin to run the command and mail you the list.

[root@thor ~]# awk -F\" '($2 ~ /\.(jpg|gif|png)/ && $4 !~ /^http:\/\/www\.lionseo\.com/){print $4}' /var/log/httpd/access_log | sort -u


The location of the Apache access log may vary depending on the Linux distribution or hosting environment you are using (the above is for a standard RHEL system).
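
The one-liner prints every referring page, so a busy hot linker can fill the list with dozens of URLs. If you would rather see one line per domain with a hit count, something like the sketch below might do. The function name hotlink_domains and its two arguments are made up for illustration, and it assumes the log is in Apache's combined format (referrer in the second quoted field).

```shell
# Hypothetical helper: list hot linking domains with hit counts, busiest first.
# Usage: hotlink_domains LOGFILE OWN_DOMAIN   (OWN_DOMAIN without the www. prefix)
hotlink_domains() {
    awk -F'"' -v own="$2" '
        # $2 is the request line and $4 the referrer in the combined log format
        $2 ~ /\.(jpg|gif|png)/ && $4 ~ /^https?:\/\// {
            split($4, parts, "/")       # parts[3] is the host in scheme://host/path
            host = tolower(parts[3])
            sub(/:[0-9]+$/, "", host)   # drop a port number if present
            sub(/^www\./, "", host)     # treat www.example.com and example.com alike
            if (host != own) count[host]++
        }
        END { for (h in count) print count[h], h }
    ' "$1" | sort -k1,1nr -k2,2
}
```

You would call it as hotlink_domains /var/log/httpd/access_log lionseo.com. Empty referrers ("-") are skipped, and other subdomains of your own site are not excluded, so expect to tidy the list by hand.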

To email yourself a list that you can copy and paste into Excel, enter the following:

[root@thor ~]# awk -F\" '($2 ~ /\.(jpg|gif|png)/ && $4 !~ /^http:\/\/www\.lionseo\.com/){print $4}' /var/log/httpd/access_log | sort -u | mail -s "Hot linkers" you@example.com

The above command assumes you have mail correctly configured on your Linux server. Replace the placeholder you@example.com with your own address.
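
If copy and paste gets fiddly, you could write the list out as a CSV file that Excel opens directly. This is only a rough sketch: the function name referrers_to_csv and the two column names are hypothetical, and it expects one referring URL per line on standard input, which is what the one-liner prints.

```shell
# Hypothetical helper: turn a list of referring URLs (one per line on stdin)
# into a two column CSV with the domain first and the full URL second.
referrers_to_csv() {
    awk '
        BEGIN { print "domain,referrer" }
        /^https?:\/\// {                # skip "-" and anything that is not a URL
            split($0, parts, "/")       # parts[3] is the host in scheme://host/path
            url = $0
            gsub(/"/, "\"\"", url)      # double any quotes, as CSV requires
            print parts[3] ",\"" url "\""
        }
    '
}
```

Pipe the sorted list into it and redirect to a file, for example by adding | referrers_to_csv > hotlinkers.csv to the end of the command.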

Now go and ask for credit links to your images. Something like this works quite well:

[Image: Request Credit Link Outreach Email]

To save time, I would use Canned Responses for the credit link request template and Boomerang (which is awesome) to remind me if they have not responded to my request.

If you are evil and they don’t respond after a couple of emails, you could block their domains from hot linking in your .htaccess with:

RewriteEngine On
RewriteCond %{HTTP_REFERER} ^https?://(.+\.)?domain1\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(.+\.)?domain2\.com/ [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(.+\.)?domain3\.com/ [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F]
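
Typing those conditions by hand gets tedious once the list grows, and it is easy to leave an OR flag on the last line, which breaks the rule. A small generator along these lines could help. Treat it as a sketch: make_hotlink_block is a made-up name, it expects one bare domain per line on standard input, and it matches both http and https referrers, which is my assumption rather than a requirement.

```shell
# Hypothetical helper: read one domain per line on stdin and print .htaccess
# rules that refuse image requests referred from those domains.
make_hotlink_block() {
    # Strip whitespace, drop empty lines, escape dots so they match literally.
    sed -e 's/[[:space:]]//g' -e '/^$/d' -e 's/\./\\./g' | awk '
        { d[++n] = $0 }
        END {
            if (n == 0) exit            # no domains, so print nothing
            print "RewriteEngine On"
            for (i = 1; i <= n; i++) {
                # every condition except the last is chained with OR
                flags = (i < n) ? "[NC,OR]" : "[NC]"
                print "RewriteCond %{HTTP_REFERER} ^https?://(.+\\.)?" d[i] "/ " flags
            }
            print "RewriteRule .*\\.(jpe?g|gif|bmp|png)$ - [F]"
        }
    '
}
```

Feed it a file of offenders, review the output, and only then paste it into your .htaccess, for example with make_hotlink_block < blocked.txt, where blocked.txt is a hypothetical file name.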