ISI Logo

Distinguished Author Series

JPT - Journal of Petroleum Technology, October 2000


2. Evolutionary Computing

Shahab Mohaghegh
Intelligent Solutions, Inc.
West Virginia University

In part one of this series, artificial neural networks were discussed, and a general definition of virtual intelligence was provided. The goal of this article is to provide an overview of evolutionary computing, its potential combination with neural networks to produce powerful intelligent applications, and its applications in the oil and gas industry. The most successful intelligent applications incorporate several virtual intelligence tools in a hybrid manner. Virtual intelligence tools complement each other and are able to amplify each other's effectiveness. In this article, a general overview of evolutionary computation and its background – as related to the Darwinian evolution theory is presented. This is followed by a more detailed look at genetic algorithms as the primary evolutionary computing paradigm that is mostly used. The article will continue and conclude by exploring the application of a hybrid neural network/genetic algorithm system to a petroleum engineering related problem.


Evolutionary computing, like other virtual intelligence tools, has its roots in nature. It is an attempt to mimic the evolutionary process using computer algorithms and instructions. However, why would we want to mimic the evolution process? The answer will become obvious once we realize what type of problems the evolution process solves and whether we would like to solve similar problems. Evolution is an optimization process1. One of the major principles of evolution is heredity. Each generation inherits the evolutionary characteristics of the previous generation and passes those same characteristics to the next generation. These characteristics include those of progress, growth and development. This passing of the characteristics from generation to generation is facilitated through genes.

Since the mid 1960s, a set of new analytical tools for intelligent optimization have surfaced that are inspired by the Darwinian evolution theory. The term "evolutionary computing" has been used as an umbrella for many of these tools. Evolutionary computing comprises of evolutionary programming, genetic algorithms, evolution strategies, and evolution programs, among others. To many people, these tools (and names) look similar and their names are associated with the same meaning. However, these names carry quite distinct meanings to the scientists deeply involved in this area of research. Evolutionary programming, introduced by John Koza2, is mainly concerned with solving complex problems by evolving sophisticated computer programs from simple, task-specific computer programs. Genetic algorithms are the subject of this article and will be discussed in detail in the next section. In evolution strategies3, the components of a trial solution are viewed as behavioral traits of an individual, not as genes along a chromosome, as implemented in genetic algorithms. Evolution programs4 combine genetic algorithms with specific data structures to achieve its goals.

Genetic Algorithms

Darwin's theory of survival of the fittest (presented in his 1859 paper titled On the Origin of Species by Means of Natural Selection), coupled with the selectionism of Weismann and the genetics of Mendel, have formed the universally accepted set of arguments known as the evolution theory3.

In nature, the evolutionary process occurs when the following four conditions are satisfied2:

  • An entity has the ability to reproduce.
  • There is a population of such self-reproducing entities.
  • There is some variety among the self-reproducing entities.
  • This variety is associated with some difference in ability to survive in the environment.

In nature, organisms evolve as they adapt to dynamic environments. The "fitness" of an organism is defined by the degree of its adaptation to its environment. The organism's fitness determines how long it will live and how much of a chance it has to pass on its genes to the next generation. In biological evolution, only the winners survive to continue the evolutionary process. It is assumed that if the organism lives by adapting to its environment, it must be doing something right. The characteristics of the organisms are coded in their genes, and they pass their genes to their offspring through the process of heredity. The fitter an individual, the higher is its chance to survive and hence reproduce.

Intelligence and evolution are intimately connected. Intelligence has been defined as the capability of a system to adapt its behavior to meet goals in a range of environments3. By imitating the evolution process using computer instructions and algorithms, researchers try to mimic the intelligence associated with the problem solving capabilities of the evolution process. As in real life, this type of continuous adaptation creates very robust organisms. The whole process continues through many "generations", with the best genes being handed down to future generations. The result is typically a very good solution to the problem. In computer simulation of the evolution process, genetic operators achieve the passing on of the genes from generation to generation. These operators (crossover, inversion, and mutation) are the primary tools for spawning a new generation of individuals from the fit individuals of the current population. By continually cycling these opera­tors, we have a surprisingly powerful search engine. This inherently preserves the critical balance needed with an intelligent search: the balance between exploitation (taking advantage of information already obtained) and exploration (searching new areas). Although simplistic from a biologist's viewpoint, these algorithms are sufficiently complex to provide robust and powerful search mechanisms.

Mechanism Of A Genetic Algorithm

The process of genetic optimization can be divided into the following steps:

  1. Generation of the initial population.
  2. Evaluation of the fitness of each individual in the population.
  3. Ranking of individuals based on their fitness.
  4. Selecting those individuals to produce the next generation based on their fitness.
  5. Using genetic operations, such as crossover, inversion and mutation, to generate a new population.
  6. Continue the process by going back to step 2 until the problem's objectives are satisfied.

The initial population is usually generated using a random process covering the entire problem space. This will ensure a wide variety in the gene pool. Each problem is encoded in the form of a chromosome. Each chromosome is collection of a set of genes. Each gene represents a parameter in the problem. In classical genetic algorithms, a string of 0s and 1s or a bit string represents each gene (parameter). Therefore, a chromosome is a long bit string that includes all the genes (parameters) for an individual. Figure 1 shows a typical chromosome as an individual in a population that has five genes. Obviously, this chromosome is for a problem that has been coded to find the optimum solution using five parameters.


Figure 1. A chromosome with five genes.

The fitness of each individual is determined using a fitness function. The goal of optimization is usually to find a minimum or a maximum. Examples of this include the minimization of error for a problem that must converge to a target value or the maximization of the profit in a financial portfolio. Once the fitness of each individual in the population is evaluated, all the individuals will be ranked. After the ranking, it is time for selection of the parents that will produce the next generation of individuals. The selection process assigns a higher probability of reproduction to the highest-ranking individual, and the reproduction probability is reduced with a reduction in ranking.

After the selection process is complete, genetic operators such as crossover, inversion and mutation are incorporated to generate a new population. The evolutionary process of survival of the fittest takes place in the selection and reproduction stage. The higher the ranking of an individual, the higher the chance for it to reproduce and pass on its gene to the next generation.


Figure 2. Simple crossover operator.

In crossover, the two parent individuals are first selected and then a break location in the chromosome is randomly identified. Both parents will break at that location and the halves switch places. This process produces two new individuals from the parents. One pair of parents may break in more than one location at different times to produce more than one pair of offspring. Figure 2 demonstrates the simple crossover.

There is other crossover schemes besides simple crossover, such as double crossover and random crossover. In double crossover, each parent breaks in two locations, and the sections are swapped. During a random crossover, parents may break in several locations. Figure 3 demonstrates a double crossover process.


Figure 3. Double crossover operator.

As was mentioned earlier, there are two other genetic operators in addition to crossover. These are inversion and mutation. In both of these operators the offspring is reproduced from one parent rather than a pair of parents. The inversion operator changes all the 0s to 1s and all the 1s to 0s from the parent to make the offspring. The mutation operator chooses a random location in the bit string and changes that particular bit. The probability for inversion and mutation is usually lower than the probability for crossover. Figures 4 and 5 demonstrate inversion and mutation.


Figure 4. Inversion operator.


Figure 5. Mutation operator.

Once the new generation has been completed, the evaluation process using the fitness function is repeated and the steps in the aforementioned outline are followed. During each generation, the top ranking individual is saved as the optimum solution to the problem. Each time a new and better individual is evolved, it becomes the optimum solution. The convergence of the process can be evaluated using several criteria. If the objective is to minimize an error, then the convergence criteria can be the amount of error that the problem can tolerate. As another criterion, convergence can take place when a new and better individual is not evolved within four to five generations. Total fitness of each generation has also been used as a convergence criterion. Total fitness of each generation can be calculated (as a sum) and the operation can stop if that value does not improve in several generations. Many applications simply use a certain number of generations as the convergence criterion.

As you may have noticed, the above procedure is called the classic genetic algorithms. Many variations of this algorithm exist. For example, there are classes of problems that would respond better to genetic optimization if a data structure other than bit strings were used. Once the data structure that best fits the problem is identified, it is important to modify the genetic operators such that they accommodate the data structure. The genetic operators serve specific purposes – making sure that the offspring is a combination of parents in order to satisfy the principles of heredity – which should not be undermined when the data structure is altered.

Another important issue is introduction of constraints to the algorithm. In most cases, certain constraints must be encoded in the process so that the generated individuals are "legal". Legality of an individual is defined as its compliance with the problem constraints. For example in a genetic algorithm that was developed for the design of new cars, basic criteria, including the fact that all four tires must be on the ground, had to be met in order for the design to be considered legal. Although this seems to be quite trivial, it is the kind of knowledge that needs to be coded into the algorithm as constraints in order for the process to function as expected.

Application In The Petroleum Industry

There have been several applications of genetic algorithms in the petroleum and natural gas industry. The first application in the literature goes back to one of Holland's students named David Goldberg. He applied a genetic algorithm to find the optimum design for gas transmission lines5. Since then, genetic algorithms have been used in several other petroleum applications. These include reservoir characterization6 and modeling7, distribution of gas-lift injection8, petrophysics9 and petroleum geology10, well test analysis11, and hydraulic fracturing design12-14.

As it was mentioned earlier, virtual intelligence techniques perform best when used to complement each other. The first hybrid neural network/genetic algorithm application in the oil and gas industry was used to design optimum hydraulic fractures in a gas storage field12-13. A brief review of the hybrid neural network/genetic algorithm is presented here.

Virtual intelligence techniques were utilized to design optimum hydraulic fractures for the Clinton Sand in Northeast Ohio. In order to maintain and/or enhance deliverability of gas storage wells in the Clinton Sand, an annual restimulation program has been in place since the late sixties. The program calls for as many as twenty hydraulic fractures and refractures per year. Several wells have been refractured three to four times, while there are wells that have only been fractured once in the past thirty years. Although the formation lacks detailed reservoir engineering data, there is wealth of relevant information that can be found in the well files. Lack of engineering data for hydraulic fracture design and evaluation had, therefore, made use of 2D or 3D hydraulic fracture simulators impractical. As a result, prior designs of hydraulic fractures had been reduced to guesswork. In some cases, the designs were dependent on engineers' intuition about the formation and its potential response to different treatments – knowledge gained only through many years of experience with this particular field. The data set used in this study was collected using well files that included the design of the hydraulic fractures. The following parameters were extracted from the well files for each hydraulic fracture treatment: the year the well was drilled, total number of fractures performed on the well, number of years since the last fracture, fracture fluid, amount of fluid, amount of sand used as proppant, sand concentration, acid volume, nitrogen volume, average pumping rate, and the service company performing the job. The match up between hydraulic fracture design parameters and the available post fracture deliverability data produces a data set with approximately 560 records.

The first step in this study was to develop a set of neural network models of the hydraulic fracturing process in the Clinton Sand. These models were capable of predicting post fracture deliverability given the input data mentioned above. Figure 6 shows the neural model's predictions compared to actual field results for three years. Data from these years were not used in the training process.

Figure 6. Neural network model's predictive capability in the Clinton Sand.

Once the neural network model's accuracy was established, it was used as the fitness function for the genetic algorithm process to form the hybrid intelligent system. The input data to the neural network can be divided into three categories:

  • Basic well information
  • Well production history
  • Hydraulic fracture design parameters such as sand concentration, rate of injection, sand mesh size, fluid type, etc.

hydraulic fracture design parameters) are among the controllable parameters. In other words, these are the parameters that can be modified for each well to achieve a better hydraulic fracture design. A two-stage process was developed to produce the optimum hydraulic fracture design for each well in the Clinton Sand. The optimum hydraulic fracture design is defined as the design that results in the highest possible post fracture deliverability. Figure 7 is a schematic diagram of the hybrid neuro-genetic procedure.

The neural network for the first stage (neural module #1) is designed and trained to perform a rapid screening of the wells. This network is designed to identify the so-called "dog wells" that would not be enhanced considerably even after a frac job. This way the genetic optimization can be concentrated on the wells that have a realistic chance of deliverability enhancement. The second stage of the process is the genetic optimization routine. This stage is performed on one well at a time. The objective of this stage is to search among all the possible combinations of design parameters and identify the combination of the hydraulic fracture parameters for a specific well that results in the highest incremental post fracture deliverability.

This second stage process (the genetic optimization routine) starts by generating 100 random solutions. Each solution is defined as a combination of hydraulic fracture design parameters. These solutions are then combined with other information available from the well and presented to the fitness function (neural network). The result from this process is the post fracture deliverability for each solution. The solutions are then ranked based on the highest incremental enhancement of the post fracture deliverability. The highest-ranking individuals are identified, and the selection for reproduction of the next generation is made. Genetic operations such as crossover, inversion and mutations are performed, and a new generation of solutions is generated. This process is continued until a convergence criterion is reached.

This process is repeated for all the wells. The wells with highest potential for post fracture deliverability enhancement are selected as the candidate wells. The combination of the design parameters identified for each well is also provided to the operator to be used as the guideline for achieving the well's potential. This process was performed for the wells in Figure 6. The result of the genetic optimization is presented in Figures 8, 9, and 10. Since the same neural networks have generated all the post fracture deliverabilities, it is expected that the post fracture deliverabilities achieved after genetic optimization have the same degree of accuracy as those that were predicted for each well's field result. In these figures, the green bars show the actual PFD of the wells achieved in the field. The red bars show the accuracy of the neural network used as the fitness function in the genetic optimization routine when it is predicting the PDF, given the design parameters used in the field. The blue bars show the PDF resulting from the same neural network that produced the red bars, but with the input design parameters the genetic algorithms proposed.

Figure 7. Hybrid neuro-genetic procedure for optimum hydraulic fracture design in the Clinton Sand.

Figure 8. Enhancement in PFD if this methodology had been used in 1989.

Please note that the process indicates that some wells can not be enhanced, regardless of the modification in the fracture design, while other wells can be enhanced significantly. This finding can have important financial impact on the operation and can help the management make better decisions in allocation of investment.

Figure 9. Enhancement in PFD if this methodology had been used in 1990.

Figure 10. Enhancement in PFD if this methodology had been used in 1991.

In another application, genetic algorithms were used in combination with neural networks to develop an expert hydraulic fracture designer14. The intelligent system developed for this purpose is capable of designing hydraulic fractures in detail, providing the pumping schedule in several stages (or in a ramp scheme), and identifying the fluid type and amount, proppant type and concentration, and the pumping rate. It was shown that fracture designs proposed by this intelligent system are comparable to those designed by expert engineers with several years of experience.

Closing Remarks

Evolutionary computing paradigms provide a rich and capable environment to solve many search, optimization, and design problems. The larger the space of the possible solutions, the more effective the use of these paradigms. Evolutionary computing, in general, and genetic algorithms, specifically, are able to combine the exploration characteristics of an effective search algorithm with a remarkable ability of preserving and exploiting the knowledge acquired during every step of the search as a guide to take the next step. This provides an intelligent approach to more efficiently solve search, optimization, and design problems.

The domain expertise is a necessity in generating useful tools using evolutionary computing. Domain expertise must be combined with a fundamental understanding of intelligent systems and experience in applying them in real life problems. This will produce intelligent tools that industry can use to help its bottom line.


  1. Mayr, E., "Toward a new Philosophy of Biology: Observations of an Evolutionist", Belknap Press, Cambridge, Massachusetts. 1988.
  2. Koza, J. R., "Genetic Programming, On the Programming of Computers by Means of Natural Selection", MIT Press, Cambridge, Massachusetts. 1992.
  3. Fogel, D. B., "Evolutionary Computation, Toward a New Philosophy of Machine Intelligence", IEEE Press, Piscataway, New Jersey, 1995.
  4. Michalewicz, Z., "Genetic Algorithms + Data Structure = Evolution Programs", Springer - Verlag, New York, New York. 1992.
  5. Goldberg, D. E., "Computer Aided Gas Pipeline Operation Using Genetic Algorithms and Rule Learning", Ph.D. dissertation, University of Michigan, Ann Arbor, Michigan. 1983.
  6. Guerreiro, J.N.C., et. al., Identification of Reservoir Heterogeneities Using Tracer Breakthrough Profiles and Genetic Algorithms, SPE 39066, Latin American and Caribbean Petroleum Engineering Conference and Exhibition held in Rio de Janeiro, Brazil, 30 August-3 September, 1997
  7. Sen, M.K. et. al., Stochastic Reservoir Modeling Using Simulated Annealing and Genetic Algorithm, SPE 24754, SPE 67th Annual Technical Conference and Exhibition of the Society of Petroleum Engineers held in Washington, DC, October 4-7, 1992.
  8. Martinez, E.R. et. al., Application of Genetic Algorithm on the Distribution of Gas-Lift Injection, SPE 26993, SPE 69th Annual Technical Conference and Exhibition held in New Orleans, LA, U.S.A., 25-28 September, 1994.
  9. Fang, J.H., et. al. Genetic Algorithm and Its Application to Petrophysics, SPE 26208, UNSOLICITED, 1992.
  10. Hu, L. Y., et. al., Random Genetic Simulation of the Internal Geometry of Deltaic Sandstone Bodies, SPE 24714, SPE 67th Annual Technical Conference and Exhibition of the Society of Petroleum Engineers held in Washington, DC, October 4-7, 1992.
  11. Yin, Hongjun, Zhai, Yunfang, An Optimum Method of Early-Time Well Test Analysis -- Genetic Algorithm, SPE 50905, International Oil & Gas Conference and Exhibition in China held in Beijing, China, 2-6 November, 1998
  12. Mohaghegh, S., Balan, B., Ameri, S., and McVey, D.S., A Hybrid, Neuro-Genetic Approach to Hydraulic Fracture Treatment Design and Optimization, SPE 36602, SPE Annual Technical Conference and Exhibition held in Denver, Colorado, 6-9 October, 1996.
  13. Mohaghegh, S., Platon, V., and Ameri, S., Candidate Selection for Stimulation of Gas Storage Wells Using Available Data With Neural Networks and Genetic Algorithms, SPE 51080, SPE Eastern Regional Meeting held in Pittsburgh, PA, 9-11 November, 1988.
  14. Mohaghegh, S., Popa, A. S. and Ameri, S.: "Intelligent Systems Can Design Optimum Fracturing Jobs", SPE 57433, Proceedings, 1999 SPE Eastern Regional Conference and Exhibition, October 21-22, Charleston, West Virginia. A summary of this article appeared in the October 1999 issue of JPT page 30.