Classification of content filters
Classification of pornography filtering solutions
1. Based on implementation
2. Based on technology
Pornography filtering solutions can be classified by implementation and by technology.
Classification based on implementation
Three types of implementation are possible:
1. PC based
2. Network based – transparent
3. Network based – proxy
PC based implementation
In this category, software is installed on the target PC and controls the end user's browsing.
It can be customized for an individual's usage and has features such as password protection and stealth mode.
Network based - transparent
In this category, network traffic passes through the device/computer, which allows or blocks individual packets based on
A second type of implementation is possible in this category, generally known as pass-by implementation.
The traffic does not pass through the device but passes by it. The device sniffs the traffic and blocks a connection by sending a TCP reset to the server.
Network based - proxy
In this category, all network traffic goes through a proxy server. That is, all TCP traffic goes through the device, and all connections originate from it.
This severely disrupts the network topology and bandwidth management/shaping.
Classification based on technology
There are three types of content filters:
1. Domain/URL based filtering
2. Keyword based filtering
3. Pattern based filtering
Domain/URL based filtering
In this type of content filtering, a database of blocked domains and/or URLs is maintained. There is usually a team of people manually editing the database all the time. These content filters generally come with a periodic subscription fee to pay for that team.
An extension to this is that the list itself is generated dynamically and only the final update is done manually.
The websites of these service providers generally offer a way to report false negatives and false positives. There is also provision for website owners to request removal from the list. All these activities are handled manually.
This is based on the premise that no filter can handle all cases effectively.
Until the arrival of NetOptima, this premise held true.
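The database lookup behind domain/URL based filtering can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any particular product: the names BLOCKED_DOMAINS, BLOCKED_URLS and is_blocked, and the example.com-style entries, are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Hypothetical database entries; a real product would load these from the
# manually maintained list described above.
BLOCKED_DOMAINS = {"blocked.example"}
BLOCKED_URLS = {"mixed.example/adult/index.html"}


def is_blocked(url):
    """Return True if the URL's host, a parent domain of the host,
    or the exact host + path is in the database."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # URL match: host + path, ignoring the scheme and query string.
    if host + parts.path in BLOCKED_URLS:
        return True
    # Domain match: test the host and each parent domain, so that
    # "www.blocked.example" is caught by the entry "blocked.example".
    labels = host.split(".")
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS
               for i in range(len(labels)))
```

Each lookup is cheap; the cost of this approach lies in keeping the two sets complete and up to date, which is the work the editing team does.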
Keyword based filtering
In this type of content filtering, a keyword database is generated and web pages are checked dynamically for these keywords. If a keyword is found in the page being rendered, the page is blocked.
If a news item on the CNN website mentions a “sex discrimination ruling” or a “man convicted for child pornography”, the news item is blocked.
So are websites like sensex.in, whose name merely contains a keyword, and scores of other such websites.
Scanning every page for keywords also takes processing power, which explains why the network devices that block pornography are bulky: for example, rack-mounted dual-processor boards just to handle 1000 customers.
Some solutions take a halfway route, combining keyword based and domain/URL based searches and updating themselves dynamically.
Most of this software either sits on the user's computer and eats its CPU cycles, or else is bulky as explained before.
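The false positives described in this section are easy to reproduce. The following is a minimal sketch of a naive keyword filter in Python; the KEYWORDS set and the function name is_blocked_by_keyword are assumptions made for this example, not taken from any real product.

```python
# Hypothetical keyword database.
KEYWORDS = {"sex", "pornography"}


def is_blocked_by_keyword(text):
    """Block if any keyword occurs anywhere in the text.

    This is a plain substring match, so a keyword embedded in a
    longer, harmless word also triggers a block.
    """
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


print(is_blocked_by_keyword("Court issues sex discrimination ruling"))  # True
print(is_blocked_by_keyword("sensex.in"))                               # True
print(is_blocked_by_keyword("Weather forecast for Monday"))             # False
```

The first two results are false positives: a legitimate news headline and a stock index website are both blocked because the filter looks only at the characters, not at the meaning.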
Pattern based filtering
In pattern based filtering, the web page is analyzed for pre-defined patterns while it is being loaded. These patterns reveal so much about a web page that it is possible to accurately assign each page a probability of being pornographic.
Others have probably considered such a process. But if the first two approaches already take so much processing power, a feasible product based on statistical filtering would seem unimaginable.
NetOptima falls in the category of pattern based filters.
Pattern matching is normally more expensive than matching domains/URLs and keywords, and would take too much processing power.
But that holds for other products. NetOptima uses very lightweight yet complex pattern matching of a kind that is almost impossible to achieve. Simple solutions are computationally expensive; smart, complex solutions are inexpensive.
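The idea of assigning a probability to a page from measured patterns can be illustrated with a generic statistical sketch in Python. This is not NetOptima's method, which this document does not describe. The pattern names, weights, bias and threshold below are all hypothetical values chosen for illustration, and the logistic function is just one common way to turn a weighted score into a probability.

```python
import math

# Hypothetical patterns and weights, chosen only for illustration.
WEIGHTS = {
    "image_to_text_ratio": 2.0,
    "popup_count": 0.8,
    "age_gate_present": 1.5,
}
BIAS = -4.0  # hypothetical offset; keeps pattern-free pages near zero


def page_probability(features):
    """Combine pattern measurements into a probability in (0, 1)
    using a logistic function over a weighted sum."""
    score = BIAS + sum(weight * features.get(name, 0.0)
                       for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-score))


def is_blocked_by_pattern(features, threshold=0.5):
    """Block the page when its probability reaches the threshold."""
    return page_probability(features) >= threshold
```

In this sketch a page is reduced to a handful of numbers before any decision is made, so the decision itself costs only a few multiplications; the expensive part of a real system would be extracting reliable patterns from the page in the first place.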