HTML to Markdown Python

Introduction

Markdown is a lightweight markup language for writing content that stays organized and readable, even as plain text. Web pages, by contrast, are formatted and presented with HTML. Converting HTML to Markdown is useful when you want to restructure content or make it easier to read and edit.

The markdownify module offers a convenient way to convert HTML to Markdown in Python. Before converting anything, install the markdownify package in your Python environment. Once it is installed, import it into your script and call its functions to turn HTML content into Markdown.

Installation

markdownify is a third-party package and is not part of the Python standard library, so it must be installed separately. Run the following command in a terminal to install it.

Example

pip3 install markdownify
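
To confirm that the installation succeeded, one option is to check whether Python can locate the package. The sketch below uses importlib.util.find_spec from the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name.

    is_installed is a hypothetical helper for this sketch, not part of markdownify.
    """
    return importlib.util.find_spec(module_name) is not None


# Prints True once "pip3 install markdownify" has succeeded in this environment.
print(is_installed("markdownify"))
```

If this prints False, the package was probably installed into a different Python environment than the one running the script.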

Converting HTML text to Markdown in Python involves the following steps.

  • Import the module: Import the markdownify module at the top of your Python script. It provides the function that converts HTML to Markdown.
  • Create the HTML text: Prepare the HTML that you want to convert. You can write it by hand, read it from a file, or download it from a web page with a library such as requests.
  • Pass the text to the markdownify function: Call the markdownify function provided by the markdownify module. It accepts an HTML string as input and returns the corresponding Markdown text.
  • Display the Markdown text: Finally, print the Markdown to the console or save it to a file using Python's built-in functions.

In short: import the module, supply the HTML text, and pass it to the markdownify function to get the Markdown version.
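
The steps can also be applied to HTML stored in a file. The following is a minimal sketch rather than a definitive implementation: the function name html_file_to_markdown, its convert parameter, and the file names mentioned afterwards are all hypothetical. By default it calls the markdownify function, which requires the package to be installed.

```python
from pathlib import Path


def html_file_to_markdown(src, dst, convert=None):
    """Read HTML from src, convert it, and write the Markdown to dst.

    html_file_to_markdown and its convert parameter are hypothetical names
    used for this sketch; they are not part of markdownify.
    """
    if convert is None:
        # Default to markdownify's converter (the package must be installed).
        from markdownify import markdownify as convert

    html = Path(src).read_text(encoding="utf-8")      # obtain the HTML text
    markdown = convert(html)                          # convert it to Markdown
    Path(dst).write_text(markdown, encoding="utf-8")  # save the result to a file
    return markdown
```

A call such as html_file_to_markdown("page.html", "page.md") would read page.html and write the converted text to page.md. Accepting the converter as an argument is a design choice that makes it easy to substitute a different conversion function.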

Example 1: Converting HTML to Markdown

Let's now look at code that converts simple HTML to Markdown.

The script first imports the markdownify module. It then defines a block of HTML to convert, consisting of a heading and a paragraph.

Next, we convert the HTML text to Markdown with the markdownify function, which takes the HTML content as input and returns the Markdown text.

Finally, we print the converted Markdown, which corresponds to the original HTML input.

Main.py

Example

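import markdownify

# HTML to convert: a heading followed by a paragraph
ht = """<h1>HTML to Markdown Python</h1>
<p>This is a demonstration code of converting HTML to Markdown in Python</p>"""

# Convert the HTML string to Markdown and print it
mt = markdownify.markdownify(ht)
print(mt)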

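Output

With markdownify's default settings, the h1 element is rendered as an underlined (Setext-style) Markdown heading and the p element as a plain paragraph. Exact blank lines may differ between markdownify versions.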

Example 2

Let's examine a slightly more complex piece of HTML. Take a look at the code below.

Main.py

Example

import markdownify
ht = """
<div class="article">
   <h1>HTML to Markdown Python</h1>
   <p>This is a demonstration code of converting HTML to Markdown in Python</p>
   <ul>
      <li>Item 1</li>
      <li>Item 2</li>
      <li>Item 3</li>
   </ul>
   <a href="https://logic-practice.com">Link to C# Tutorial</a>
</div>
"""
mt = markdownify.markdownify(ht)
print(mt)

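Output

Here the list items are rendered as Markdown bullet points and the anchor as an inline link of the form [text](URL), while the wrapping div adds no markup of its own.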

Conclusion

In summary, converting HTML to Markdown with Python is a useful way to restructure content and make it easier to read. The markdownify library handles the conversion with a single function call.
