Monday, April 11, 2011

Pattern as Hypothesis

In some sense, many discoveries are based on identification of a novel pattern - someone has an AHA moment and realizes that a piece of data or observation is related to another. Perception can be reframed with this new knowledge. Rules can be identified to the advantage of the observer.

I've discussed some of this before, trying to define what makes a pattern a pattern. However, no one wins a Nobel Prize or gains competitive advantage by rehashing existing patterns. Some do, however, use new patterns to make novel discoveries or exploit behavioral trends; the knowledge from new patterns gives them an edge using the same data that everyone else has at hand.

A downside to pattern reliance is the human predisposition to assign meaning to events when there isn't necessarily any. Over the past 500 years, two related but different concepts arose to help sort out the problems that the search for an explanation can cause.
  • Scientific process: The scientific process focuses on creating a hypothesis and performing experiments (or using other means) to collect data to support or refute the original hypothesis.
  • Statistics: Statistics developed as a discipline to use data to combat a tendency to rely on subjective experiences (data can be thought of as a more "objective" experience, although how it's measured, collected, and recorded becomes its own can of worms).
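To make the "meaning where there isn't any" problem concrete, here is a minimal Python sketch. It is an illustration only, not taken from any particular tool. It generates pure random noise and reports the strongest correlation found between a target and a number of unrelated candidate variables. The function names, the sample sizes, and the choice of Pearson correlation are assumptions made for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_noise_correlation(n_candidates, n_observations=20, seed=0):
    """Correlate a random target with random candidates; return the strongest r.

    Every variable is independent noise, so any "pattern" found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        r = pearson(candidate, target)
        if abs(r) > abs(best):
            best = r
    return best


# The more candidates we search, the stronger the best spurious match can get.
for n in (1, 10, 100):
    print(n, "candidates -> strongest |r| =", abs(strongest_noise_correlation(n)))
```

With a fixed seed, searching more candidates can only keep or increase the strongest correlation found, even though nothing real is there. That is the trap the scientific process and statistics were built to catch.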

Pattern based analytics can be thought of as the process of bringing these two disciplines together to search for patterns in "big data". Once there are too many observations or too many variables for a single person to synthesize or analyze in a reasonable timeframe, it's time to augment the human capability with analytics software.

Using tried and tested techniques from both the scientific process and statistics, pattern based analytics software should, at a minimum, help you with the following tasks:
  • Search for patterns in your dataset - find interactions between variables that affect other variables and that can be classified as patterns
  • Create hypotheses or discover rules - assert the existence of a pattern and assess its strength
  • Validate a pattern's existence - make sure the pattern is valid and help figure out if it's something that can be controlled (this is tricky, regardless of what you're doing. See how analytics can get in your way.)
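One hedged way to picture the "assert a pattern and assess its strength" and "validate" steps is a permutation test. The sketch below is an assumption about how such a check could look, not a description of any product's method. It treats the correlation between two variables as the hypothesized pattern, then shuffles one variable many times to see how often chance alone produces a correlation at least as strong. The names `pearson` and `permutation_p_value`, the synthetic data, and the number of permutations are all illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Estimate how often shuffled data matches or beats the observed |r|.

    A small value suggests the pattern is unlikely to be chance alone.
    It says nothing about causation or whether the pattern can be controlled.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))  # strength of the hypothesized pattern
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_permutations + 1)


# Synthetic example: ys depends on xs plus a little noise (assumed data).
data_rng = random.Random(1)
xs = list(range(30))
ys = [2 * x + data_rng.gauss(0, 1) for x in xs]

print("pattern strength r =", pearson(xs, ys))
print("permutation p-value =", permutation_p_value(xs, ys))
```

A check like this only addresses whether the pattern stands out from noise in this dataset. Whether it holds on new data, or reflects something you can act on, still needs the judgment described above.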

2 comments:

  1. Yeah, it's easy for us humans to find patterns we think exist. We can fool ourselves into thinking that our patterns are the most important patterns and that we made some big discovery. It's quite possible that our new discovery is just due to random noise, an invalid assumption, or faulty analysis. This has been an issue in trying to find a cure for cancer. This is where software such as Leapworks Pattern Based Analytics can help. It just looks at the data and automatically finds the most prominent patterns in it. Of course, it's still up to the user to create a useful dataset and draw proper conclusions from the patterns found, but the hard work of finding valid patterns is made easy.

  2. Another possible task that Pattern Based Analytics can help address is Transfer Learning. In real life, people usually don’t learn from scratch. We apply our previous knowledge to new problems to learn new patterns. For example, experience in playing the celesta can help us learn how to play the piano. In data mining, we can transfer a pattern learned in one feature domain to another, or transfer a pattern in the same feature domain from one data distribution to another. For example, in healthcare we may identify a pattern based on a small dataset and refine it as more data becomes available, or we can apply this pattern to another dataset to help us learn new patterns. For sales problems, patterns for one product can be transferred to another similar product to help us quickly build a high-quality model. Please note that the seed patterns can even be specified by a domain expert. If we can transfer and refine patterns well, the potential is huge.

