Double Match Triangulator

Version 1.5, 22 Mar 2017

Double Match Triangulator is an autosomal DNA analysis tool.
It combines two different people's segment match files
from FamilyTreeDNA, 23andMe, or GEDmatch to provide Double Match and Triangulation data
that can be used to help visualize and determine genealogical relationships.
Double Match Triangulator was conceptualized and developed by Louis Kessler (Behold Genealogy).

The Purpose of DMT

Autosomal DNA analysis requires comparing the segments of your DNA that match other people. You do this to identify segments that came from common ancestors, and then see who matches you on each segment to help determine who the common ancestor might be.
It is difficult with current tools to determine all the segments of all the people you match. Double Match Triangulator was developed to identify all your matches from FamilyTreeDNA and present them to you visually in an Excel spreadsheet. DMT can also work with matches from 23andMe or GEDmatch if you convert them into FamilyTreeDNA format. With DMT, you can skip the long and involved identification step and jump right ahead into the analysis.
The information Double Match Triangulator (DMT) presents will help you to:
  • Determine the ancestors who pass you each segment of your DNA (i.e. map your segments to your ancestors),
  • Identify relatives who share your common ancestors on those segments, and
  • Phase your matches to determine which are on your Paternal and which are on your Maternal side.
I hope in the future that I can develop algorithmic techniques that will let DMT do more of the analysis for you.
DMT works best with FamilyTreeDNA match data since it has fewer restrictions on the number, size, and coverage of matches than 23andMe data or GEDmatch data. AncestryDNA does not let you download segment matches, so your Ancestry data must be uploaded to GEDmatch before DMT can use it. You cannot compare data between different companies because DMT uses the tester's name to match people, and the tester's name is often different at the different companies. Also, the companies each use different algorithms to determine segment base addresses, so the match locations differ from company to company.

Single Match Triangulation (SMT)

The basis upon which DMT is founded is the technique of Triangulation. When three people, Person a, Person b, and Person c, all match each other on a particular segment of DNA, that segment is said to be Triangulated, and the three people form a Triangulation Group (TG). As long as none of the three matches is a random match (by chance), the three people could have received the segment from a common ancestor who passed it down to all of them.
The standard method of doing triangulation is what I'll call Single Match Triangulation (SMT). A person, let's call you Person a, looks at your own chromosome matches and finds people who match you and each other on the same segment. For example, below is a view from FamilyTreeDNA's Chromosome Browser tool showing where 3 different people match Person a over the 23 chromosomes. FamilyTreeDNA can compare up to 5 people with Person a at once. Each compared person is represented as a different color.
You can see a location on Chromosome 14 where 2 people match Person a on the same segment.
The Single Match Triangulation process is as follows:
1. Find a segment where at least two people match Person a.  Let's call them Person c1 and Person c2.
2. If the segment that all three match is of sufficient size (at least 15 cM is usually considered sufficient), then conclude that this segment triangulates and that Persons a, c1 and c2 all have a common ancestor from this segment.
3. If the segment is not of sufficient size in step 2, then verify that Person c1 and Person c2 match each other on that segment.
4. If they do, the criterion for sufficient size is reduced (down to as little as 5 cM). If the size is now sufficient, then conclude that this segment triangulates and that Persons a, c1 and c2 all have a common ancestor from this segment.
5. If the segment is smaller than 5 cM, then you'll need to find at least one or two other matches over the same segment, and verify that Person c3 and Person c4 match one or more of the others before assuming that they all form a Triangulation Group.
The minimum length criterion is there to guard against random matches by chance. When the segment is large enough, the likelihood of a chance match is minimal. Matching 3 people makes chance matches less likely and reduces the segment size requirement. Analysis of Triangulation indicates that the sufficient size may be reduced down to as little as 5 cM. For smaller segments, there's safety in numbers. The more people who match 3 ways on the same segment, the more likely you have a Triangulation Group.
In Step 3 and Step 5, there is a very bothersome task. You cannot verify the other matches from your own FamilyTreeDNA information. Using "In Common With" is not good enough, because it does not guarantee that the other people match on the same segment. Instead, you must contact one of the other people, and ask them to check their Chromosome Browser tool and see if they match the other people on the same segment.
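The five steps above can be sketched as a small decision function. This is only an illustration of the procedure as described, not code from DMT. The function name, its parameters, the returned strings, and the choice of one extra confirmed person as the minimum for Step 5 are all assumptions made for the example; the 15 cM and 5 cM values are the ones quoted in the steps.

```python
from typing import Optional

LARGE_CM = 15.0  # Step 2: size usually considered sufficient on its own
SMALL_CM = 5.0   # Step 4: reduced size once c1 and c2 are verified to match


def smt_verdict(shared_cm: float,
                c1_matches_c2: Optional[bool] = None,
                extra_confirmed: int = 0,
                min_extra: int = 1) -> str:
    """Hypothetical helper: classify a segment where c1 and c2 both match Person a.

    shared_cm       -- length in cM of the segment all three overlap on
    c1_matches_c2   -- True/False once checked, None if not yet verified
    extra_confirmed -- further people (c3, c4, ...) verified to match one or
                       more of the others on this segment
    min_extra       -- assumed minimum for Step 5 ("at least one or two")
    """
    if shared_cm >= LARGE_CM:            # Step 2
        return "triangulates"
    if c1_matches_c2 is None:            # Step 3 has not been done yet
        return "verify c1-c2"
    if not c1_matches_c2:                # Step 3 failed
        return "does not triangulate"
    if shared_cm >= SMALL_CM:            # Step 4
        return "triangulates"
    if extra_confirmed >= min_extra:     # Step 5
        return "triangulates"
    return "need more matches"


if __name__ == "__main__":
    print(smt_verdict(18.2))                      # large segment
    print(smt_verdict(9.0))                       # must ask c1 or c2 first
    print(smt_verdict(9.0, c1_matches_c2=True))   # verified, at least 5 cM
```

The "verify c1-c2" result is the point where you would have to contact one of the other people, which is the bothersome task described above.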

FamilyTreeDNA's Chromosome Browser Results (CBR) File

All the above is really a lot of work. Identifying matches using the FamilyTreeDNA's Chromosome Browser can be tedious as you can only do them 5 at a time. The only way to verify all the third matches is by contacting the other people and having them check for you, and this takes time and diligence.
You can simplify the identification of matches. FamilyTreeDNA allows you to download all of your segment matches in one file. You do so from their Chromosome Browser by clicking on their link that is named: "Download All Matches to Excel (CSV Format)" highlighted and shown by the hand pointer below.
This will download a special file named:   nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv
   nnnnnn is your FamilyTreeDNA kit number,
   yyyymmdd is the date of your download, and
   .csv indicates this is a comma delimited file which can be read by Excel and other programs.
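The naming pattern above can be checked with a regular expression. The sketch below is not how DMT itself inspects file names; it only illustrates the structure: a kit number, the fixed "_Chromosome_Browser_Results" stem, a tail (normally the date), and the ".csv" extension. The function name, the assumption that kit numbers contain only letters and digits, and the case-sensitive comparison are choices made for the example.

```python
import re

# kit number, fixed stem, free-form tail (normally the date), .csv extension
CBR_NAME = re.compile(
    r"^(?P<kit>[A-Za-z0-9]+)_Chromosome_Browser_Results(?P<tail>.*)\.csv$"
)


def parse_cbr_name(filename):
    """Hypothetical helper: return (kit, tail) for a CBR-style name, else None."""
    m = CBR_NAME.match(filename)
    if m is None:
        return None
    # Drop the underscore that separates the stem from the tail.
    return m.group("kit"), m.group("tail").lstrip("_")


if __name__ == "__main__":
    print(parse_cbr_name("123456_Chromosome_Browser_Results_20170322.csv"))
    print(parse_cbr_name("my_matches.csv"))
```

The kit number and date in the usage lines are made up for illustration.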
Do not rename the "nnnnnn_Chromosome_Browser_Results" part of the CBR file names or change their ".csv" extension. DMT looks for that to find CBR files and will not accept files that do not follow that structure.  However, you may change the filename starting at the date and up to the period. I do this to all of my CBR files so that I can easily identify who each file belongs to, e.g.:
When loaded into Excel, the file can be seen to contain all the chromosome segment matches for Person a:
Note: All surnames have been changed in all examples to a letter and five digits to protect the privacy of the test takers.
Column A (NAME) is Person a.
Column B (MATCHNAME) is the Person c matching Person a on that segment. The file lists matchnames in alphabetical order, so blank names are first.
Columns C (CHROMOSOME), D (START LOCATION) and E (END LOCATION) are the chromosome number and the Start and End locations of the matching segment on that chromosome.
Columns F (CENTIMORGANS) and G (MATCHING SNPS) are the length and size of the matching segment.
Using this file can make Single Match Triangulation easier since it will show (with a bit of Excel mastery) every segment with two or more people who match with Person a over at least part of the segment.
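For readers who prefer a script to Excel, the same lookup can be sketched in Python. The example assumes the CSV header row uses exactly the column names listed above; the function names and the sample rows are invented for illustration and are not taken from any real file.

```python
import csv
import io
from itertools import combinations


def read_cbr(handle):
    """Read CBR rows from an open text file into a list of dicts."""
    rows = []
    for rec in csv.DictReader(handle):
        rows.append({
            "match": rec["MATCHNAME"].strip(),
            "chrom": rec["CHROMOSOME"].strip(),
            "start": int(rec["START LOCATION"]),
            "end": int(rec["END LOCATION"]),
            "cm": float(rec["CENTIMORGANS"]),
        })
    return rows


def overlapping_pairs(rows):
    """List (chrom, name1, name2, start, end) where two different people
    match Person a over at least part of the same segment."""
    by_chrom = {}
    for r in rows:
        by_chrom.setdefault(r["chrom"], []).append(r)
    found = []
    for chrom, segs in by_chrom.items():
        for s, t in combinations(segs, 2):
            if s["match"] == t["match"]:
                continue
            lo = max(s["start"], t["start"])
            hi = min(s["end"], t["end"])
            if lo < hi:  # the two segments share some stretch
                found.append((chrom, s["match"], t["match"], lo, hi))
    return found


if __name__ == "__main__":
    sample = io.StringIO(
        "NAME,MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS,MATCHING SNPS\n"
        "Person a,Person c1,14,20000000,30000000,12.5,2500\n"
        "Person a,Person c2,14,25000000,40000000,18.0,3600\n"
        "Person a,Person c3,2,1000000,5000000,6.1,900\n"
    )
    for pair in overlapping_pairs(read_cbr(sample)):
        print(pair)
```

The pairwise comparison is quadratic per chromosome, which is fine for a sketch but would be slow on a file with many thousands of segments.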
However, Person a is still only matched to each of these people. One of c1 and c2 will still need to be contacted to verify that Person c1 matches to Person c2, and one of c3 and c4 to verify that Person c3 matches Person c4 on a segment.

Double Match Triangulation (DMT)

After seeing this Chromosome Browser Results file, I thought it might be easier to determine true triangulations if I had another person's Chromosome Browser Results file. I contacted several people in my match list and asked if they could download their CBR files and send them to me. I put two CBR files together into a single Excel file and played with it. See my blog posts on this: EAST Part 1 and EAST Part 2.
I realized that with two Chromosome Browser Results files, you have all the matches of two people, Person a and Person b. You can now find all the people that Person a and Person b both match on the same segments. Once Person b has let you use their CBR file, you no longer have to contact any of the other people, because Person b's file tells you which of them match on each segment with you.
In other words, you get a list of all the Person a matches to Person c on the same segment that Person b matches Person c. I call each of these a Double Match.
This makes up two of the three sides of the triangulation. To get the Triangulated segments, you only need the Person a to Person b segment matches and these are already included in your own CBR file.
The subtle difference here is that Single Match Triangulation helps to find the people who Triangulate with you on one segment, whereas Double Match Triangulation finds every Double Match and Triangulation on all segments where Person b is involved. I identify Double Match segments where Person a also matches Person b using the term: Full Triangulation, to indicate that Person a matches Person c, Person b matches Person c, AND Person a matches Person b on that segment.
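The idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DMT's actual algorithm: segments are reduced to a name, a chromosome and a start and end position; people are paired up by name; any positive overlap counts; and no minimum size is applied. The Seg type and the function names are invented for the example.

```python
from collections import namedtuple

# Hypothetical record for one row of a CBR file.
Seg = namedtuple("Seg", "match chrom start end")


def overlap(s, t):
    """Return the (start, end) shared by two segments, or None."""
    if s.chrom != t.chrom:
        return None
    lo, hi = max(s.start, t.start), min(s.end, t.end)
    return (lo, hi) if lo < hi else None


def double_matches(a_file, b_file, name_a, name_b):
    """Find every Person c matching both a and b on the same segment.

    a_file, b_file -- lists of Seg read from Person a's and Person b's files
    Each result is (name_c, chrom, start, end, kind).
    """
    ab_segs = [s for s in a_file if s.match == name_b]  # the a-b side
    b_by_name = {}
    for t in b_file:
        b_by_name.setdefault(t.match, []).append(t)

    results = []
    for s in a_file:
        if s.match in (name_a, name_b):
            continue  # Person c must be a third person
        for t in b_by_name.get(s.match, []):
            span = overlap(s, t)
            if span is None:
                continue
            probe = Seg(name_b, s.chrom, span[0], span[1])
            has_ab = any(overlap(probe, ab) is not None for ab in ab_segs)
            kind = "Full Triangulation" if has_ab else "Double Match only"
            results.append((s.match, s.chrom, span[0], span[1], kind))
    return results


if __name__ == "__main__":
    a_file = [Seg("Person b", "1", 100, 500), Seg("Person c", "1", 200, 400),
              Seg("Person c", "2", 100, 300)]
    b_file = [Seg("Person a", "1", 100, 500), Seg("Person c", "1", 300, 600),
              Seg("Person c", "2", 150, 250)]
    for row in double_matches(a_file, b_file, "Person a", "Person b"):
        print(row)
```

Only Person a's file is consulted for the a-b side here, since the a-b segment matches are already included in your own CBR file.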
Don't confuse Full Triangulation with the false, unconfirmed Single Match Triangulation, where Person a matches Person c1 and Person a matches Person c2, but Person c1 and c2 have not been shown to match on the segment or even been shown to have any matches in common. That is a common mistake, because there is no guarantee in this case that Person c1 and c2 are at all related.

The Double Match Triangulator program (also called DMT)

Two Chromosome Browser Results files can contain a lot of information. Each file can have as many as 10,000 or even 100,000 or more segment matches with 1,000 or even 10,000 people. Putting this information together quickly and presenting it clearly so that it can be analyzed is not a trivial task. I experimented on various ways of displaying the results usefully in Excel and came up with a segment map that I thought would be useful. Then, I created my Double Match Triangulator program to automate this.
DMT will find ALL the Double Matches between two CBR files. This could be 1,000 or even 10,000 double matches. As many as 25% of these will overlap with one of the Person a to Person b matches and therefore fully Triangulate.
Since single matches are not included, a lot of the problematic segments that match by chance in Single Match Triangulation are not even considered in Double Match Triangulation.
The Double Matches that don't Triangulate will have two of the three matches, a-c and b-c, and are referred to as Missing a-b Matches. These are matches on the opposite halves of Person c's chromosome, so one match (either a-c or b-c) has to be from Person c's father, and the other from Person c's mother. It is possible for these Missing a-b Matches to indicate both chromosome halves of a common ancestor, recombined by a child of two of their descendants. More likely, one half could be from an ancestral father and the other from the ancestral mother (see more discussion on this on my blog post: Triangulation and Missing a-b Segments). And of course, either match could be a chance match, which could lead to two different people who may be related or unrelated.
Overlapping Double Matches are grouped together into Double Match Groups (DMG). DMT denotes these on its Map page by drawing a thick box around them. The Double Matches which also Triangulate define Triangulation Groups which, if not formed by chance, will denote a DNA segment that comes from a common ancestor of Person a, b and c.
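Grouping overlapping Double Matches is essentially interval merging, which can be sketched as follows. The exact rule DMT uses to form a Double Match Group is not spelled out here, so this example assumes the simplest one: sort by chromosome and start position, and chain a Double Match into the current group whenever it overlaps the stretch the group already covers. The tuple layout and the function name are invented for the example.

```python
def double_match_groups(double_matches):
    """Group overlapping Double Matches on the same chromosome.

    double_matches -- iterable of (name_c, chrom, start, end) tuples
    Returns a list of dicts with keys chrom, start, end and members.
    """
    groups = []
    # Sort so that overlapping segments on one chromosome sit next to each other.
    ordered = sorted(double_matches, key=lambda d: (d[1], d[2]))
    for dm in ordered:
        _name, chrom, start, end = dm
        last = groups[-1] if groups else None
        if last is not None and last["chrom"] == chrom and start < last["end"]:
            # Overlaps the current group: add it and extend the covered span.
            last["members"].append(dm)
            last["end"] = max(last["end"], end)
        else:
            groups.append({"chrom": chrom, "start": start, "end": end,
                           "members": [dm]})
    return groups


if __name__ == "__main__":
    dms = [("Person c1", "1", 100, 300), ("Person c2", "1", 250, 400),
           ("Person c3", "1", 500, 600), ("Person c4", "2", 100, 200)]
    for g in double_match_groups(dms):
        print(g["chrom"], g["start"], g["end"], len(g["members"]))
```

Chaining means two Double Matches can land in the same group without overlapping each other directly, as long as a third one bridges them. Whether that matches the boxes drawn on the Map page is an assumption.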
The details of the program and my thoughts on interpreting the DMT output are described on the pages that follow.