Mar 20, 2017

Machine Learning in Cyber Security Domain - 9: Botnet Detection

A botnet is an organized, automated army of zombies that can be used to launch DDoS attacks, flood inboxes with spam, or spread viruses. This army consists of a large number of compromised computers. Attackers use it for malicious purposes, and the owners of the zombie machines are generally not even aware that their computers are being used this way.

Zombies have been used extensively to send spam mail; as of 2005, an estimated 50–80% of all spam worldwide was sent by zombie computers. This allows spammers to avoid detection and presumably reduces their bandwidth costs, since the owners of zombies pay for their own bandwidth. The general structure of a botnet attack is given below.

Mar 13, 2017

Machine Learning in Cyber Security Domain - 8: Spam Filter

Spam mail (also known as junk mail) is a type of electronic spam in which unsolicited messages are sent by email. Most spam messages are generated for commercial purposes, but some also carry malicious content, for example a message that imitates a popular website and is in fact a phishing attack. Malicious content may include malware, scripts, or executable file attachments. When a user recognizes a spam mail, he or she can easily add its source to a blacklist, but some emails are crafted professionally, and most of the time standard users cannot easily recognize them as spam. For this reason, mail service providers use spam filter applications that are developed with machine learning techniques. One of the best-known algorithms for spam detection is Naive Bayes, which is based on a statistical approach. In this section, we will explain how the Naive Bayes algorithm works.
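
To make the statistical idea concrete, here is a minimal sketch of a Naive Bayes spam classifier. It is not the implementation used by any particular mail provider: the four training messages, the whitespace tokenizer, and the add-one (Laplace) smoothing are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter

# Toy training set (hypothetical messages, for illustration only).
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def tokenize(text):
    """Split a message into lowercase words (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(messages):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the class with the highest log-probability for the message."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the log prior: how common the class is overall.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids a zero probability for unseen pairs.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("free money", model))
print(classify("monday meeting", model))
```

The "naive" part is the assumption that words occur independently given the class, which is why the per-word log-likelihoods can simply be added up.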

Mar 6, 2017

Machine Learning in Cyber Security Domain - 7: IDS/IPS with ML



Intrusion Detection and Intrusion Prevention Systems (IDS/IPS) analyze data packets and determine whether they are part of an attack. After the analysis, the system can take precautions according to the result. Based on their operational logic, IDS/IPSs fall into two main categories: (1) Signature Based IDS and (2) Anomaly Based IDS.
A Signature Based IDS works with attack signatures, which are created from information about known vulnerabilities. Signatures contain detailed information about attacks. This type of system has a high accuracy rate for known attacks, but it cannot detect unknown attacks.
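
The signature-based idea can be sketched in a few lines. This is only an illustration: the two signatures and their names are hypothetical, and real systems use much richer rule languages than a plain byte-substring match.

```python
# Hypothetical signatures: a name mapped to a byte pattern seen in known attacks.
SIGNATURES = {
    "sql-injection": b"' OR '1'='1",
    "path-traversal": b"../../",
}


def match_signatures(payload):
    """Return the names of all known signatures found in a packet payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


print(match_signatures(b"GET /index.php?id=1' OR '1'='1"))
print(match_signatures(b"GET /index.html"))
```

A payload that matches no stored pattern produces an empty list, which is exactly why this approach misses attacks that have no signature yet.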

Feb 27, 2017

Machine Learning in Cyber Security Domain - 6: False Alarm Rate Reduction

IDS/IPS systems may classify an event correctly or incorrectly. In the literature, classified events are evaluated in four categories.
  1. True Positives (TP): intrusive and anomalous,
  2. False Negatives (FN): intrusive but not anomalous,
  3. False Positives (FP): not intrusive but anomalous,
  4. True Negatives (TN): not intrusive and not anomalous.
TP and TN represent correctly classified events; FP and FN represent wrongly classified events. Recognizing an FN (intrusive but not anomalous) is a very hard task: the system cannot detect it by itself, so a human must be involved in the mechanism to recognize this type of event. An FP (not intrusive but anomalous) is an event classified as intrusive although it is actually a normal user’s event. This is a very common occurrence in today’s systems. False alarm rate reduction is one of the challenging problems, especially for IDS/IPS systems that are used commercially.
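
As a rough sketch of how a false alarm rate can be measured, the snippet below counts the four outcomes from labelled events and divides the false alarms (events that were not intrusive but were flagged as anomalous) by all non-intrusive events. The event list and the dictionary keys are made up for illustration.

```python
def summarize(events):
    """Count outcomes from (intrusive, anomalous) boolean pairs."""
    counts = {"detected": 0, "missed": 0, "false_alarm": 0, "benign_ok": 0}
    for intrusive, anomalous in events:
        if intrusive and anomalous:
            counts["detected"] += 1      # intrusive and flagged
        elif intrusive:
            counts["missed"] += 1        # intrusive but not flagged
        elif anomalous:
            counts["false_alarm"] += 1   # not intrusive but flagged (FP)
        else:
            counts["benign_ok"] += 1     # not intrusive and not flagged
    return counts


def false_alarm_rate(counts):
    """False alarms as a fraction of all non-intrusive events."""
    benign = counts["false_alarm"] + counts["benign_ok"]
    return counts["false_alarm"] / benign if benign else 0.0


# Hypothetical labelled events: (intrusive, anomalous)
events = [
    (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
counts = summarize(events)
print(counts, false_alarm_rate(counts))
```

Reducing the false alarm rate means lowering the `false_alarm` count without pushing more intrusive events into `missed`.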

Feb 20, 2017

Machine Learning in Cyber Security Domain - 5: Captcha Bypassing


Before we explain how the captcha mechanism can be bypassed, we want to give a brief introduction to what a captcha is and how it works.
The main purpose of the captcha mechanism is to protect authentication by asking users questions that are easy for humans but tough for bots. It is imperative to make solving a captcha challenge as effortless as possible for legitimate users while remaining robust against automated solvers. This way, bots cannot try to enter systems automatically.
The first captcha mechanisms used a single image and asked the user to type the numbers or characters shown in the image into a textbox. Sample images are given to the right of the paragraph. You may have already noticed the line noise in the images. There is a reason for those lines: plain digits or characters are easy to detect with image processing techniques. Therefore, the images used in captcha mechanisms are transformed into more complex forms to make them more difficult to break. At first, these transformations were done by adding noise to the image. Such images are given below.
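
The kind of line noise described above can be sketched on a tiny binary image. This is only an illustration of the idea: the function name, the list-of-rows image format, and the way each line drifts by one pixel per column are assumptions, not how any specific captcha generator works.

```python
import random


def add_line_noise(image, num_lines=2, seed=None):
    """Return a copy of a binary image (rows of 0/1) with random lines drawn on it."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    noisy = [row[:] for row in image]  # copy so the input stays untouched
    for _ in range(num_lines):
        row = rng.randrange(height)
        for col in range(width):
            noisy[row][col] = 1
            # Drift up or down by at most one pixel, staying inside the image.
            row = min(height - 1, max(0, row + rng.choice((-1, 0, 1))))
    return noisy


blank = [[0] * 20 for _ in range(8)]
for row in add_line_noise(blank, num_lines=2, seed=1):
    print("".join("#" if pixel else "." for pixel in row))
```

Because each line crosses every column, it overlaps the characters and makes simple per-character segmentation harder.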

Jan 31, 2017

Machine Learning in Cyber Security Domain - 4: Secure User Authentication


As a dictionary term, verification refers to procedures for checking that a product, service, user, or system meets its requirements and specifications and fulfills its intended purpose. User authentication (or user verification) is the mechanism that permits a user to log in to applications or systems. In an ideal system, no one except the real user can access a user account. In general, a username and password are used for authentication when the target system is an online service. These fields are vulnerable to brute force attacks if no preventive measures are taken: attackers can try all combinations to crack a user’s password (trial and error). It is strongly recommended to use secure passwords that contain numbers, letters, and special characters and that meet a minimum length. Security-conscious companies maintain password creation policies to make sure that every employee’s password is safe. If a user takes these precautions, cracking his or her password through online brute force may take years. Security-aware companies store user passwords in the database as hashes, so that even if their systems are hacked, the passwords are much harder to recover. Of course, the hash algorithm used must be strong, such as an adaptive hash algorithm (bcrypt). Besides these precautions, additional security mechanisms such as captcha and two-factor authentication are used to prevent unauthorized access to systems.
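
Storing a password as a salted, deliberately slow hash can be sketched as follows. Note the substitution: the text mentions bcrypt, which is a third-party package in Python, so this sketch uses PBKDF2 from the standard library instead. The iteration count, salt size, and function names are illustrative assumptions, not recommendations.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=ITERATIONS):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("c0rrect-H0rse!")
print(verify_password("c0rrect-H0rse!", salt, stored))
print(verify_password("wrong-guess", salt, stored))
```

Only the salt and the digest are stored. The high iteration count makes every guess expensive, which is what slows down an attacker who has stolen the database.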

Jan 23, 2017

Machine Learning in Cyber Security Domain - 3: Fraud Detection

Fraud is one of the oldest things in human history. As long as there have been fraudulent people, there have also been people who are defrauded. Money, for example in the form of credit cards, is a well-known target of fraudulent activity. With the development of the e-marketing sector, the number of fraudulent activities is rising day by day. Users’ credit card information is stored in the databases of companies such as banks, online shopping companies, and online service providers. With the widespread use of the internet, we witness a growing presence of fraud in online transactions. As a consequence, a need has emerged for automatic systems that are able to detect and fight fraudsters.