What is RegEx?



The term “RegEx” is short for “regular expression.” The term comes from computer science, and in practice regular expressions are used to check or process certain character strings, for example when validating text.

Such string comparisons are usually invisible to the user. When a text is searched or a new password is set, for example, a RegEx is often at work behind the scenes. A regular expression is used, for instance, when software enforces a password policy that requires certain combinations of characters. For such a password rule, the expression can look like this:

(?=^.{8,}$)((?=.*\d)|(?=.*\W+))(?![.\n])(?=.*[A-Z])(?=.*[a-z]).*$

This rule contains numerous requirements, for example a minimum length of eight characters and the use of both upper- and lower-case letters. The part .{8,}, for instance, means that any character (symbolized by the period) must appear eight times or more.

Illustration: RegEx example, author: Seobility

As the example shows, the characters used in RegEx and their functions are complex, but correspondingly powerful.
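
To make the rule more concrete, the following is a minimal sketch that applies the password expression from above with Python's standard "re" module. The function name is_valid_password and the sample passwords are assumptions chosen for illustration, and other RegEx flavors may handle some details differently.

```python
import re

# The password rule from the example above.
PASSWORD_RULE = re.compile(
    r"(?=^.{8,}$)((?=.*\d)|(?=.*\W+))(?![.\n])(?=.*[A-Z])(?=.*[a-z]).*$"
)

def is_valid_password(candidate: str) -> bool:
    """Return True if the candidate satisfies every condition in the rule."""
    return PASSWORD_RULE.match(candidate) is not None

if __name__ == "__main__":
    # Sample passwords are illustrative only.
    for candidate in ["Passw0rd!", "password", "Short1A", "Password"]:
        print(candidate, is_valid_password(candidate))
```

In this sketch only the first sample passes: the others lack an upper-case letter, are shorter than eight characters, or contain neither a digit nor a special character.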

Characters and patterns - components of a RegEx

Individual characters describe functions in a regular expression. A sequence of such characters forms the so-called pattern. This pattern is applied to a character string in a text and reports whether the string fulfills the specifications in the pattern or not. After this test, any action can be attached to either outcome, a match or a non-match. A regular expression is therefore often used inside a condition in software.
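
As a small illustration of a pattern used inside a condition, here is a Python sketch. The date pattern and the function name describe are hypothetical and only serve as an example.

```python
import re

# Hypothetical pattern: a date written as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def describe(text: str) -> str:
    """Branch on whether the pattern is found anywhere in the text."""
    if DATE_PATTERN.search(text):
        return "contains a date"
    return "no date found"

if __name__ == "__main__":
    print(describe("Released on 2021-03-15."))
    print(describe("Released last spring."))
```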

In practice, this is particularly relevant for finding and replacing certain character strings, a task that is central to standard server administration tools such as "sed". sed has supported regular expressions since its earliest versions.
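
Search and replace can be sketched in Python with re.sub. The host names and the function name replace_host below are made up for illustration. The call is roughly comparable to running sed 's/old\.example\.com/new.example.com/g' on the same text.

```python
import re

def replace_host(config_text: str) -> str:
    """Replace every occurrence of the old host name with the new one."""
    # The dots are escaped so that they match a literal "." only.
    return re.sub(r"old\.example\.com", "new.example.com", config_text)

if __name__ == "__main__":
    sample = "ServerName old.example.com\nServerAlias www.old.example.com"
    print(replace_host(sample))
```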

Regular expressions are available in most programming languages as well as in text-processing tools. This allows functionality that would otherwise require lengthy code to be replaced by a short regular expression and a few lines that handle its result.
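
As a hedged illustration of how compact this can be, the following Python sketch pulls all whole numbers out of a text. A hand-written version would have to scan the text character by character and collect runs of digits itself. The function name extract_numbers is an assumption.

```python
import re

def extract_numbers(text: str) -> list:
    """Return every run of digits in the text as an integer."""
    return [int(number) for number in re.findall(r"\d+", text)]

if __name__ == "__main__":
    print(extract_numbers("3 apples, 12 pears and 150 grapes"))
```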

Concatenation of regular expressions

RegEx expressions can be combined to reflect complex requirements for a character string. So-called operators connect regular expressions with one another. One example is alternation, which is written with "|". It corresponds to the OR operator from logic.

The counterpart to the "and" connection known from logic is called concatenation in regular expressions. It needs no symbol of its own: expressions are simply written one after another.

By linking different RegEx expressions in this way, complex query mechanisms can be implemented with just a few characters.
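
Both operators can be tried out in a short Python sketch. The patterns and the helper name matches are illustrative assumptions.

```python
import re

# Alternation: "|" accepts either the left or the right expression.
PET = r"(cat|dog)"

# Concatenation: expressions written one after another must match
# in exactly that order. The result equals the pattern "gr(a|e)y".
GREY = "gr" + "(a|e)" + "y"

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

if __name__ == "__main__":
    for word in ["cat", "dog", "cow"]:
        print(word, matches(PET, word))
    for word in ["gray", "grey", "groy"]:
        print(word, matches(GREY, word))
```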

Quantifiers in a regular expression

Within a regular expression there are further operators with predefined meanings, above all the quantifiers. They specify how often a particular character or character string may occur. For example, a question mark (?) means that the preceding expression should appear either zero times or exactly once. The asterisk (*), on the other hand, allows any number of occurrences of the preceding expression, including none. Between these extremes, any combination of optional and mandatory occurrences can be defined with the notation {n,m}, where n stands for the minimum number of repetitions and m for the maximum. If m is left out, as in {n,}, there is no upper limit. This generic approach makes it possible to express a very wide range of requirements in a regular expression, regardless of their practical meaning. However, using regular expressions in practice requires a high level of abstraction.
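
The quantifiers described here can be checked with a few sample patterns in Python. The patterns and the helper name full are assumptions made for this sketch.

```python
import re

def full(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

OPTIONAL_U = r"colou?r"      # "?": the "u" appears zero times or once
ANY_B = r"ab*c"              # "*": any number of "b", including none
TWO_TO_FOUR = r"\d{2,4}"     # {n,m}: between two and four digits
AT_LEAST_THREE = r"\d{3,}"   # {n,}: three digits or more

if __name__ == "__main__":
    print(full(OPTIONAL_U, "color"), full(OPTIONAL_U, "colour"))
    print(full(ANY_B, "ac"), full(ANY_B, "abbbc"))
    print(full(TWO_TO_FOUR, "123"), full(TWO_TO_FOUR, "12345"))
    print(full(AT_LEAST_THREE, "123456"), full(AT_LEAST_THREE, "12"))
```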

Advanced expressions

Once you have internalized the principles of regular expressions, you can write simple expressions and use them with confidence. However, regular expressions make much more possible. For example, a search can be aborted conditionally, or previously matched groups of characters can be referenced later in the expression (backreferences). Since the logic of regular expressions is not always intuitive for humans, confident use of the more advanced RegEx features tends to be reserved for experts in the field. If you master regular expressions, however, they can cover numerous requirements. Using them makes tasks such as web server administration or text processing considerably simpler.
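
Referencing a group backwards can be illustrated with a Python sketch that finds accidentally doubled words. The pattern and the function name find_repeated_words are assumptions chosen for this example.

```python
import re

# (\w+) captures a word; \1 refers back to exactly the text captured.
REPEATED_WORD = re.compile(r"\b(\w+) \1\b")

def find_repeated_words(text: str) -> list:
    """Return each word that appears twice in a row, separated by a space."""
    return [match.group(1) for match in REPEATED_WORD.finditer(text)]

if __name__ == "__main__":
    print(find_repeated_words("This is is a test of the the parser"))
```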

Example of a regular expression in web server administration

One example of the use of regular expressions is the mod_rewrite module for the Apache web server. It is used to redirect incoming URL requests to the corresponding area of a website. Rules can take the entire URL into account: RewriteRule patterns are matched against the URL path, while the query string can be examined with RewriteCond.

The rules can be stored, for example, in the .htaccess file on the web server. A statement that can often be found there is:

RewriteEngine on
RewriteRule (.*)\.html$ /cgi-bin/script.pl?var=$1

This rule means that all requests for an HTML file are directed to a script, with the requested path (without the .html ending) being passed as the parameter "var". The syntax of this rule is based on regular expressions, which enable complex control of a web server.
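
The effect of the rule's pattern and substitution can be imitated in Python. This is only a sketch of the string handling under the assumption that the path arrives without a leading slash, as is typical in .htaccess files. It does not reproduce how Apache processes requests, and the function name rewrite is hypothetical.

```python
import re

# Same pattern as in the RewriteRule above.
RULE_PATTERN = re.compile(r"(.*)\.html$")

def rewrite(path: str) -> str:
    """Imitate the substitution: pass the path without ".html" as "var"."""
    match = RULE_PATTERN.search(path)
    if match is None:
        return path  # the rule does not apply; leave the path unchanged
    return "/cgi-bin/script.pl?var=" + match.group(1)

if __name__ == "__main__":
    print(rewrite("products/shoes.html"))
    print(rewrite("images/logo.png"))
```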

Importance of regular expressions for SEO

The “mod_rewrite” module described above is of great importance for search engine optimization, as it allows complex dynamic URLs to be presented as descriptive (“speaking”) URLs. Such simple URLs are important because they help users and search engines understand the content structure and hierarchy of a website. For this reason, knowledge of regular expressions is also an advantage in search engine optimization.
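
How a descriptive URL can be mapped back to a dynamic one is sketched below in Python. The URL scheme (/shop/category/id), the target script index.php and the function name to_dynamic_url are all hypothetical; a real site would define its own rules, for instance in mod_rewrite.

```python
import re
from typing import Optional

# Hypothetical scheme for descriptive URLs: /shop/<category>/<numeric id>
SPEAKING_URL = re.compile(r"^/shop/(?P<category>[a-z-]+)/(?P<id>\d+)/?$")

def to_dynamic_url(path: str) -> Optional[str]:
    """Translate a descriptive path into the dynamic URL behind it."""
    match = SPEAKING_URL.match(path)
    if match is None:
        return None  # the path does not follow the assumed scheme
    return "/index.php?category={}&id={}".format(
        match.group("category"), match.group("id")
    )

if __name__ == "__main__":
    print(to_dynamic_url("/shop/running-shoes/42"))
    print(to_dynamic_url("/about"))
```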
