I am currently preparing a requirements document for a new in-house web-based project. While looking at various similar websites I needed to know what technologies were being used to develop them.
My methods:
A site's web technology covers everything from the OS, web server, app server and CDN to the server-side scripting language, language framework, CMS and client-side technologies.
Sometimes all you need to find out is which language the site is coded in. This can usually be read off the URL in the address bar (.php, .aspx, .py). The method gets harder when the site uses a "clean" or "SEO-friendly" URL scheme, which produces webpage addresses without filename extensions or even without filenames. Even then, it is sometimes possible to look in the anchor tags and form tags to find the "actual" URL with the file extension.
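The extension check is easy to automate. Here is a minimal sketch in Python, assuming a small hand-made extension table (EXTENSION_HINTS and both function names are my own illustrations, not part of any tool). It looks at a single URL, or at every anchor href and form action in a page:

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative mapping only; it is an assumption, not an exhaustive list.
EXTENSION_HINTS = {
    ".php": "PHP",
    ".aspx": "ASP.NET",
    ".jsp": "Java (JSP)",
    ".py": "Python",
}


def guess_language_from_url(url):
    """Return a language hint based on the file extension in the URL path."""
    path = urlparse(url).path
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_HINTS.get(suffix)  # None when there is no known extension


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and action values from <form> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag == "form" and attrs.get("action"):
            self.urls.append(attrs["action"])


def guess_languages_from_html(html):
    """Return the set of language hints found in the page's links and forms."""
    collector = LinkCollector()
    collector.feed(html)
    hints = {guess_language_from_url(u) for u in collector.urls}
    hints.discard(None)
    return hints
```

A clean URL such as http://example.com/blog/my-post/ gives no hint by itself, but a form on the same page that posts to /search.php still gives the language away.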
If you need to know the CMS used, you will automatically find out the language used for the site as well. For example, if the site uses WordPress as the CMS, you can be assured that the site is in PHP. Knowing each CMS's naming conventions for variables, files, etc. can help here. Finding the CMS can be as easy as looking at the webpage for the CMS's logo or checking the HEAD meta tags. Failing that, look at the naming conventions in the JS and CSS resource files (WordPress names start with "wp-") or check the comments in the HTML/CSS/JS code.
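The meta tag and "wp-" checks can be sketched in a few lines of Python. This is only a rough heuristic under my own assumptions: it looks for a meta tag named "generator" (which many CMSs emit but any site can remove) and falls back to the wp-content and wp-includes paths that WordPress uses for its resources. The names GeneratorFinder and guess_cms are made up for this example.

```python
import re
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Remember the content of <meta name="generator" ...> if the page has one."""

    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "generator":
            self.generator = attrs.get("content")


# Assumption: resource paths containing these directories point at WordPress.
WORDPRESS_MARKERS = re.compile(r"wp-(content|includes)/")


def guess_cms(html):
    """Return a best-effort CMS hint for the page source, or None."""
    finder = GeneratorFinder()
    finder.feed(html)
    if finder.generator:
        return finder.generator
    if WORDPRESS_MARKERS.search(html):
        return "WordPress (guessed from wp- paths)"
    return None
```

The generator tag is checked first because it names the CMS outright; the path check is only a fallback and will miss sites that rename or hide those directories.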
Finding the framework used in a site can be tricky, because frameworks are meant to be highly customizable and may not contain default code that can be used to identify them. Still, checking the HEAD meta tags and code comments can be useful, and it is the first thing I always do.
Identifying the client technologies in use is, of course, trivial, since you have the source code right there in your browser. JavaScript frameworks like jQuery, MooTools and Yahoo's YUI can all be identified easily by looking at the JS files.
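As a rough sketch of that check in Python, the snippet below collects the src of every script tag and matches it against a few library names. The LIBRARY_HINTS table and the function name are my own assumptions, and plain substring matching can produce false positives (any file with "yui" in its name would match), so treat the result as a hint only.

```python
from html.parser import HTMLParser

# Illustrative substrings to look for in script file names (an assumption).
LIBRARY_HINTS = {
    "jquery": "jQuery",
    "mootools": "MooTools",
    "yui": "YUI",
}


class ScriptCollector(HTMLParser):
    """Collect the src attribute of every external <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def guess_js_libraries(html):
    """Return the set of library names hinted at by the script file names."""
    collector = ScriptCollector()
    collector.feed(html)
    found = set()
    for src in collector.sources:
        lowered = src.lower()
        for needle, name in LIBRARY_HINTS.items():
            if needle in lowered:
                found.add(name)
    return found
```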
Tools:
Some of the information listed above is sent to your browser by most websites anyway, as HTTP headers. There are extensions for all major browsers that let you see these headers. There are also web-based tools (Web-Sniffer) for this.
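If you would rather script the header check, here is a minimal sketch using only Python's standard library. The choice of headers in INTERESTING_HEADERS is my own assumption about which ones tend to reveal the stack, and both function names are illustrative.

```python
from urllib.request import Request, urlopen

# Headers that commonly reveal server-side technology (illustrative choice).
INTERESTING_HEADERS = ("Server", "X-Powered-By", "X-AspNet-Version")


def summarize_headers(headers):
    """Pick the technology-revealing headers out of a header mapping.

    Header names are compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        name: lowered[name.lower()]
        for name in INTERESTING_HEADERS
        if name.lower() in lowered
    }


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling summarize_headers(fetch_headers(url)) on a site's address returns whichever of these headers the site chooses to send, which may be none at all.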
Of course, a website may opt not to send this information via the headers, and then we have to fall back on the old and tedious methods of getting it.
Quite recently though, I have found an awesome tool that finds out all the information listed above, and more!
To someone like me, who used to go through each site's code to get this information, builtwith is like a magical panacea. Another site that I like in a similar vein is CMS Detector, which, in spite of its name, detects more than just the CMS used in a website.
In truth, though, a website can ask sites like builtwith to stop displaying its information to the public. In that case, you are back to hacking through the spaghetti code jungle that is the website's code and coming up with the answers the old-fashioned way.