Double Match Triangulator

Version 2.1.1,   28 Mar 2018

 
 
Double Match Triangulator (DMT) is an autosomal DNA analysis tool.
It combines two different people's segment match files
from Family Tree DNA, 23andMe, MyHeritage DNA or GEDmatch to provide double match and triangulation data
that can be used to help visualize and determine genealogical relationships.
 
Double Match Triangulator was conceptualized and developed by Louis Kessler (Behold Genealogy).
DMT placed 3rd in the Innovator Showdown at RootsTech 2017 in Salt Lake City.
 
 
 
 

The Purpose of DMT

 
Double Match Triangulator (DMT) is an autosomal DNA analysis tool. DMT combines segment match data of two or more people to find all the double matches and all the triangulations between them.
 
It is difficult with current tools to identify all the shared segments between all the people you match to. Double Match Triangulator was developed to do this for you and present your segment matches visually in an Excel spreadsheet. With DMT, you can skip the long and involved step of looking for and identifying triangulations. Instead you can jump right ahead into the analysis of your segment matches.
 
The information Double Match Triangulator (DMT) presents will help you to:
  • Phase your matches to determine which are on your paternal and which are on your maternal side,
  • Identify relatives who are sharing segments with you that may come from common ancestors,
  • Determine the ancestors who passed you each segment of your DNA (i.e. map your segments to your ancestors).
 
 

Segment Match Data

 
DMT uses segment match data that is available from Family Tree DNA, 23andMe, MyHeritage DNA, or GEDmatch. These are the only companies that provide you with all your segment match data. As of January 2018, AncestryDNA does not give you access to your segment match data. To use your AncestryDNA test results, you have to upload your data to Family Tree DNA or GEDmatch and then download from there. Family Tree DNA charges $19 to transfer there and be able to download segment match data. Transfer to GEDmatch is free, but it requires a Tier 1 subscription ($10 for a month) to access the utility that enables you to download segment match data from their site.
 
Segment match data differs a bit from company to company. DMT can read the segment match data from all 4 of the sources listed above.
 
Family Tree DNA gives you all single matches down to 1 cM so you get the most detail. Even though most small single matching segments are false, double matching and triangulating will eliminate many of the false matches. Family Tree DNA unfortunately sorts their segment match data by the name of the tester. If two people have the same name, e.g. both being John Smith, their results will be mixed together. DMT does its best to detect this, inform you of it, and eliminate duplicate matches where possible. You can only download segment match data for the kits you administer. To get the segment match data from your DNA relatives, you will have to contact them and ask them to download their segment match data and send it to you.
 
23andMe matches are only provided for people who specifically opt in to participate in DNA Relatives sharing, so you only get the segment matches of some of the people you match to there. 23andMe will give you all single matches down to 5 cM. People are organized by match name. However, since no ID is given, you cannot tell who is who when two testers have the same match name. DMT will combine the segments of people with the same match name, inform you of this, and eliminate duplicate matches where possible. You can only download segment match data for the kits you administer. To get the segment match data from your DNA relatives, you will have to contact them and ask them to download their segment match data and send it to you.
 
MyHeritage DNA gives you all single matches down to 6 cM. People are organized by match name with different matches separated by blanks. However, since no ID is given, you cannot tell who is who when two testers have the same match name. DMT will combine the segments of people with the same match name, inform you of this, and eliminate duplicate matches where possible. You can only download segment match data for the kits you administer. To get the segment match data from your DNA relatives, you will have to contact them and ask them to download their segment match data and send it to you.
 
GEDmatch match data is only provided for the closest 10,000 segment matches. Because of this, one person's match list may go down to a lower minimum cM than another person's, and the two will therefore not double match below the higher of the two minimums. The segment match data does not download but is only displayed as a report on the webpage. Double Match Triangulator includes a tool to help you capture the data and save it. The biggest advantage of GEDmatch is that you can get anyone's segment match data yourself and do not need to contact anyone else to do so. You can get other people's data using their ID, which you can find in the GEDmatch one-to-many matches report. However, you can only get data of people who have already uploaded to GEDmatch. If someone you want to compare is not at GEDmatch, you will have to contact them and get them to upload their data there.
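 
The effect of the two different GEDmatch minimums can be pictured with a small Python sketch. It is only an illustration of the rule described above, not DMT's own code, and the function name is hypothetical: it takes the cM values found in two match lists and returns the level below which the two lists cannot double match.

```python
def common_cm_floor(cms_a, cms_b):
    """Return the higher of the two lists' smallest cM values.

    cms_a, cms_b: cM lengths of the segments in Person A's and
    Person B's match lists (hypothetical inputs for illustration).
    Below the returned value, at least one list has no segments,
    so no double match can be found there.
    """
    return max(min(cms_a), min(cms_b))
```

For example, if Person A's list goes down to 7.2 cM and Person B's only down to 8.5 cM, double matches can only appear at 8.5 cM and above.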
 
Generally, Double Match Triangulator will only give you good results if you compare Family Tree DNA data to Family Tree DNA data, 23andMe data to 23andMe data, MyHeritage DNA data to MyHeritage DNA data, or GEDmatch data to GEDmatch data. Each company uses its own user name and/or ID number to identify people. DMT uses the name/ID for comparisons, so if people's names/IDs differ, they will not match. Also, each company gives slightly different match results than the others do. You are likely safe to compare different companies' results that have been transferred to Family Tree DNA or uploaded to GEDmatch, since the raw data has been converted to that company's common denominator and a common matching method is used.
 
 
 

Single Match Triangulation

 
The basis upon which DMT is founded is the technique of Triangulation. When three people, Person A, Person B, and Person C all match each other on a particular segment of DNA, that segment is said to be Triangulated. As long as none of the three matches is a random match (by chance), then the three people could have got the segment from a common ancestor who passed it down to all of them.
 
The standard method of doing triangulation without DMT is using what I'll call Single Match Triangulation. A person, let's call you Person A, looks at your own chromosome matches and finds people who match you and each other on the same segment. For example, below is a view from Family Tree DNA's chromosome browser tool showing where 3 different people match Person A over the 23 chromosomes. Family Tree DNA can compare up to 5 people with Person A at once. Each compared person is represented as a different color.
 
 
You can see a location on chromosome 14 where 2 people match Person A on the same segment.
 
The single match triangulation process is as follows:
1. Find a segment where at least two people match Person A.  Let's call them Person C1 and Person C2. You should know in advance that they both match on either your maternal side or your paternal side.
2. If the segment that all three match is of sufficient size (at least 15 cM is usually considered sufficient), then conclude that this segment triangulates and that Persons A, C1 and C2 all have a common ancestor from this segment.
3. If the segment is not of sufficient size in step 2, then verify that Person C1 and Person C2 match each other on that segment.
4. If they do, the criterion for sufficient size is now reduced (down to 7 cM), and if the size is now sufficient, then conclude that this segment triangulates and that Persons A, C1 and C2 all have a common ancestor from this segment.
5. If the segment is smaller than 7 cM, then you'll need to find at least one or two other matches over the same segment, and verify that Person C3 and Person C4 match one or more of the others before assuming that they all form a triangulation group.
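 
The steps above can be written down as a short decision function. This is only a sketch in Python of the rule of thumb as described here; the function and parameter names are hypothetical, and the 15 cM and 7 cM thresholds are the ones quoted in the steps.

```python
LARGE_CM = 15     # step 2: large enough without further checking
VERIFIED_CM = 7   # step 4: large enough once C1 and C2 are verified

def single_match_triangulates(segment_cm, c1_c2_verified, extra_verified_people=0):
    """Apply the single match triangulation steps to one shared segment.

    segment_cm: size in cM of the segment where C1 and C2 both match Person A.
    c1_c2_verified: True if C1 and C2 are confirmed to match each other there.
    extra_verified_people: how many further people (C3, C4, ...) are confirmed
        to match one or more of the others on the same segment.
    """
    if segment_cm >= LARGE_CM:          # step 2
        return True
    if not c1_c2_verified:              # step 3: the C1 to C2 match is required
        return False
    if segment_cm >= VERIFIED_CM:       # step 4
        return True
    return extra_verified_people >= 1   # step 5: small segments need more people
```

A 10 cM segment, for instance, only passes once the C1 to C2 match has been verified, and a 5 cM segment needs at least one more verified person on top of that.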
 
The minimum length criterion is there to prevent random matches by chance. When the segment is large enough, the likelihood of a chance match is minimal. Matching 3 or 4 or more people makes chance matches less likely and reduces the segment size requirement. For smaller segments, there's safety in numbers. The more people 3-way matching on the same segment, the more likely you have a group of people with a common ancestor. But beware: any one person may match randomly to all the others, and they will look just like relatives who do share a common ancestor. Matching on other segments with some of the same people will add evidence to the pot that the matches may be valid.
 
Steps 3 and 5 involve a very bothersome task. You cannot verify the other matches from your own match information alone. Using "In Common With" is not good enough, because it does not guarantee that the other people match on the same segment. Instead, you must contact one of the other people, and ask them to check their chromosome browser tool and see if they match the other people on the same segment.
 
 
 

Double Match Triangulation

 
By just using one person's segment matches, you do not have enough information to triangulate. However, once you get a second person's segment matches, you do.
 
With the segment matches of two people, let's call them Person A and Person B, you can now find all the people that Person A and Person B both match to on the same segments. In other words, you can make a list of all the Person A matches to Person C that overlap on the same segment that Person B matches Person C. I call each of these a Double Match.
 
This makes up two of the three sides of a triangulation. To get the triangulated segments, you only need to inspect the Person A to Person B segment matches and find those that overlap with the double matches. You already have the Person A to Person B matches because both segment match files include them.
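 
To make the idea concrete, here is a minimal Python sketch of double matching followed by the AB check. It is not DMT's actual implementation. The Segment tuple, the function names and the use of plain start/end positions are all assumptions made for the illustration.

```python
from collections import namedtuple

# One row of a match file: who is matched, and where (hypothetical layout).
Segment = namedtuple("Segment", "person chromosome start end")

def overlap(s, t):
    """Return the (start, end) region shared by two segments, or None."""
    if s.chromosome != t.chromosome:
        return None
    start, end = max(s.start, t.start), min(s.end, t.end)
    return (start, end) if start < end else None

def double_matches(a_matches, b_matches, name_b):
    """List (person_c, chromosome, start, end, triangulates) tuples.

    a_matches and b_matches are the Segment rows from Person A's and
    Person B's files; name_b is how Person B appears in Person A's file.
    """
    ab = [s for s in a_matches if s.person == name_b]   # the A to B matches
    results = []
    for ac in a_matches:
        if ac.person == name_b:
            continue
        for bc in b_matches:
            if bc.person != ac.person:
                continue
            shared = overlap(ac, bc)
            if shared is None:
                continue
            dm = Segment(ac.person, ac.chromosome, shared[0], shared[1])
            # A double match triangulates when an A to B match overlaps it.
            tri = any(overlap(dm, s) is not None for s in ab)
            results.append((dm.person, dm.chromosome, dm.start, dm.end, tri))
    return results
```

Each row that comes back with the last value True is a triangulation; the rows with False are double matches where the AB side is missing.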
 
So here is a comparison between single match triangulation and double match triangulation:
 
Some versus All:
Single match triangulation helps to find a few people who might triangulate with you on one segment.
Double match triangulation finds every double match and every triangulation on all segments where Person B is involved.
 
Unverified BC versus Verified BC:
Single match triangulation tells you where Person A matches Person B and Person A matches Person C but you don't know if Person B matches Person C on that segment or if they're even related to each other.
Double match triangulation tells you where Person A matches Person C and Person B matches Person C. So the hard-to-verify BC gets verified for you by the double match. You also know if AB matches or not.
 
Don't confuse unconfirmed single match triangulation, where the Person B to Person C match has not been verified, with the true triangulation done by Double Match Triangulator.
 
 

Downloading Segment Match Data

 
All the above is really a lot of work. Identifying matches using a chromosome browser can be tedious as you can only do them a few at a time. The only way to verify all the third matches is by contacting the other people and having them check for you, and this takes time and diligence. At GEDmatch, you can verify these third matches yourself, but it's still a lot of work.
 
You can simplify the identification of matches. Family Tree DNA, 23andMe and MyHeritage DNA allow you to download all of your segment matches in one file, and for GEDmatch, Double Match Triangulator helps you download their segment matches.
 
 

1. Family Tree DNA

 
At Family Tree DNA, you download match data from their chromosome browser page by clicking on their link that is named: "Download All Matches to Excel (CSV Format)" that is highlighted and shown by the hand pointer below.
 
This will download a special file named:   nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv
where:
   nnnnnn is your Family Tree DNA kit number,
   yyyymmdd is the date of your download, and
   .csv indicates this is a comma delimited file which can be read by Excel and other programs.
 
Do not rename the "nnnnnn_Chromosome_Browser_Results" part of the segment match file names or change their ".csv" extension. DMT looks for that to find segment match files and will not accept files not following that structure. However, you may change the filename starting at the date and up to the period. I do this to all of my segment match files so that I can easily identify who the file belongs to, e.g.:
nnnnnn_Chromosome_Browser_Results_yyyymmdd_John_Smith.csv
 
When loaded into Excel, the file can be seen to contain all the chromosome segment matches:
 
Note: All surnames have been changed in all examples to a letter and five digits to protect the privacy of the test takers.
 
Column A (NAME) is the name of the tester.
Column B (MATCHNAME) is the person who matches. The file lists matchnames in alphabetical order, so blank names are first.
Columns C (CHROMOSOME), D (START LOCATION) and E (END LOCATION) are the chromosome number and the start and end locations of the matching segment on that chromosome.
Columns F (CENTIMORGANS) and G (MATCHING SNPS) are the distance in centimorgans (cM) and the number of SNPs for the matching segment.
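 
If you want to look at such a file outside of Excel, the columns above can be read with a few lines of Python. This is only a sketch. It assumes the first line of the file is a header row spelled exactly like the column names listed above, and the function name is hypothetical.

```python
import csv
import io

def read_ftdna_segments(text):
    """Parse chromosome browser CSV text into a list of dictionaries."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": row["NAME"].strip(),
            "matchname": row["MATCHNAME"].strip(),
            "chromosome": row["CHROMOSOME"].strip(),  # kept as text, may be "X"
            "start": int(row["START LOCATION"]),
            "end": int(row["END LOCATION"]),
            "cm": float(row["CENTIMORGANS"]),
            "snps": int(row["MATCHING SNPS"]),
        })
    return rows
```

To use it on a real download, read the file into a string first and pass that string to the function.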
 
 

2. 23andMe

 
At 23andMe, Select:  All Tools -> DNA Relatives and then go to the bottom of the page and select "Download aggregate data" that is shown by the hand pointer below:
 
 
This will download a special file named:   nnnnnn_relatives_download.csv
where:
   nnnnnn is your name, and
   .csv indicates this is a comma delimited file which can be read by Excel and other programs.
 
Important note:  23andMe does not include the name of the tester in the file. The only indication of who the tester is comes from the nnnnnn in the filename. If the name in the filename does not match the reference to that person in the other person's match file, then DMT can't find the matches between them. DMT gives you a warning about this, and tells you what you should do to change the filename.
 
Do not rename the "_relatives_download" part of the segment match file names or change their ".csv" extension. DMT looks for that to find segment match files and will not accept files not following that structure. However, you may change the filename after the word download. I often add the date so that I can compare with my previous download to see what new matches have been added, e.g.:
nnnnnn_relatives_download_yyyymmdd.csv
 
When loaded into Excel, the file can be seen to contain all the chromosome segment matches, as well as a lot of other information:
 
 
Column A (Display Name) is the name of the person who matches.
Column B (Surname) is the surname of the person who matches.
Columns C (Chromosome Number), D (Chromosome Start Point) and E (Chromosome End Point) are the chromosome number and the start and end locations of the matching segment on that chromosome.
Columns F (Genetic Distance) and G (# SNPs) are the distance in centimorgans (cM) and the number of SNPs for the matching segment.
Columns H through AE contain additional data including: Link to Compare View, Sex, Birth Year, Set Relationship, Predicted Relationship, Relative Range, Percent DNA Shared, # Segments Shared, Maternal Side, Paternal Side, Maternal Haplogroup, Paternal Haplogroup, Birthplace, Residence, Family Surnames, Family Locations, Maternal Grandmother Birth Country, Maternal Grandfather Birth Country, Paternal Grandmother Birth Country, Paternal Grandfather Birth Country, Self Reported Ashkenazi Jewish Descent, Notes, Sharing Status.  Double Match Triangulator currently does not make use of this extra data.
 
 

3. MyHeritage DNA

 
At MyHeritage DNA, go to the DNA Matches page, select the "Advanced options" dropdown, and select "Export shared DNA segment info for all DNA Matches".
 
 
This will download a special file named:   nnnnnn DNA Matches shared segments dddddd.csv
where:
   nnnnnn is your name,
   dddddd is the date, and
   .csv indicates this is a comma delimited file which can be read by Excel and other programs.
 
Do not rename the " DNA Matches shared segments " part of the segment match file names or change their ".csv" extension. DMT looks for that to find segment match files and will not accept files not following that structure.
 
When loaded into Excel, the file can be seen to contain all the chromosome segment matches, as well as a lot of other information:
 
 
Column A (Name) is the name of the tester.
Column B (Match name) is the name of the person who matches.
Columns C (Chromosome), D (Start Location) and E (End Location) are the chromosome number and the start and end locations of the matching segment on that chromosome.
Columns F (Start RSID) and G (End RSID) are the Reference SNP identifiers (i.e. the names) of the starting and ending SNPs.
Columns H (Centimorgans) and I (SNPs) are the distance in centimorgans (cM) and the number of SNPs for the matching segment.
 
 

4. GEDmatch

 
At GEDmatch, you need to pay for Tier 1 services to get access to a report to display your match data. Tier 1 services currently cost $10 a month with a minimum one-month signup.
 
With Tier 1 services enabled, go to Tier 1 Utilities -> Matching Segment Search. Enter your Kit Number and press the Submit button.
 
 
Once the report completes, you will get something looking like this:
 
 
GEDmatch does not give you a way to download this segment match data. So built into Double Match Triangulator, you'll find a "Save GEDmatch" button just below the File A dropdown box:
 
 
 
If you click on that button now, you'll get this message box:
 
 
What you must do in your web browser is to select everything from the GEDmatch page and copy it to the clipboard. Depending on your web browser, you can do this one of 3 ways:
  • From the menu:  Edit->Select all,
  • With your mouse right-click and pick "Select all", or
  • With the keyboard type Ctrl+A (i.e. while holding down the Ctrl key, type "A").
     
Then copy the selection to the clipboard one of these 3 ways:
  • From the menu:  Edit->Copy,
  • With your mouse right-click and pick "Copy", or
  • With the keyboard type Ctrl+C (i.e. while holding down the Ctrl key, type "C").
 
Once you've done that, you can click the "Save GEDmatch" button and it will open a "Save As" window to allow you to save the segment match data where you want. In doing so, it suggests a file name for you in the form of kkkkkk_GEDmatch_Matching_Segments_nnnnnn.csv
where:
   kkkkkk is the GEDmatch kit number,
   nnnnnn is the GEDmatch name of the tester, and
   .csv indicates this is a comma delimited file which can be read by Excel and other programs.
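 
The four filename structures described in this section can be told apart by the fixed part each one contains. The Python sketch below illustrates that idea. It is not how DMT is known to do it: the exact, case-sensitive comparison is an assumption, and so is treating the suggested GEDmatch name as a required structure.

```python
# Fixed filename parts taken from the descriptions in this section.
MARKERS = [
    ("Family Tree DNA", "_Chromosome_Browser_Results"),
    ("23andMe", "_relatives_download"),
    ("MyHeritage DNA", " DNA Matches shared segments "),
    ("GEDmatch", "_GEDmatch_Matching_Segments_"),
]

def classify_match_file(filename):
    """Return the source a segment match file appears to come from, or None."""
    if not filename.lower().endswith(".csv"):
        return None
    for source, marker in MARKERS:
        if marker in filename:
            return source
    return None
```

A renamed file such as nnnnnn_Chromosome_Browser_Results_yyyymmdd_John_Smith.csv is still recognized, because only the fixed part and the extension are checked.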
 
When loaded into Excel, the file can be seen to contain all the data from the GEDmatch matching segment results:
 
 
Column A (Tester) is the name and kit number of the tester.
Column B (Kit) is the kit number of the person who matches.
Columns C (Chr), D (Start Position) and E (End Position) are the chromosome number and the start and end locations of the matching segment on that chromosome.
Columns F (cM) and G (SNPs) are the distance in centimorgans (cM) and the number of SNPs for the matching segment.
Columns H and I contain additional data provided by GEDmatch: Sex and Email. Double Match Triangulator currently does not make use of this extra data.
 
Important to note:  GEDmatch's Matching Segment Search report for some unknown reason excludes segment matches with the subject's parents, children or siblings. If you want to include these close relatives in your GEDmatch Matching Segments file, you'll have to do a GEDmatch "One-to-one compare" and a GEDmatch "X One-to-one" with them and then manually add the matches to your GEDmatch Matching Segments file. DMT does not at this time include a tool to help you do this.
 
 

Some Things To Understand

 

DMT Can Only Find Triangulations Contained In The Match Data

 
If you have a match file that was downloaded prior to the date that a DNA relative tested, then that relative will not be in the match file. But the relative's match file will contain the first person's match. Don't feel you have to download new match files every time you get a new file from a relative. Older files will usually do fine. Just realize you won't get any double matches or triangulations with people who tested after the file was downloaded. The only time you'll need to update a file is if there are some people who recently tested who you want included in the analysis. Then you should update your Person A and Person B files so that they both will include those people.
 
Another reason why valid matches may be missing is when one person does not meet the company's minimum match criteria with another person. Then even though they may actually match on a few segments, neither of them will be included in the other's match data.
 
GEDmatch segment match data only gives the 10,000 closest matches, so one person's GEDmatch data may go down to one cM level, and a second person's GEDmatch data may go down to a different cM level. In this case one person's data may match the other, but the other's may not match back.
 
Despite this, be assured that the double matches and triangulations found by DMT include every double match and every triangulation contained in the match data. Valid matches that are not in the data obviously cannot be determined.
 
 

Triangulation Does Not Mean IBD, But IBD Means Triangulation

 
Every segment of your DNA was passed down through one specific ancestral line. When two people were passed the same segment from the same ancestor, that segment is said to be Identical By Descent (IBD). Two or more people having IBD segments means that they share a common ancestor.
 
It is difficult to prove a segment match is IBD. Triangulation, by contrast, is simply the case where three people all match each other on the same segment, and through double matching you can always tell for sure whether the segment triangulates or not.
 
Some people mistakenly believe that triangulating segments are IBD. That thinking is incorrect. Small double matching or triangulating segments of 7 cM or less may still match randomly. Also, triangulating segments of any size may match three people through both opposing parental chromosomes and thus not be IBD.
 
However, any segment that truly is IBD must triangulate.
 
Many segments that are not IBD are eliminated from consideration by DMT through double matching and triangulation. But no IBD segments will be eliminated, since triangulations are never eliminated.
 
 

Two Related People May Not Have Any IBD Segments

 
Once you get to 2nd cousins once removed, or 3rd cousins and further, there is a chance that those two relatives do not share any DNA. The probability of this grows larger the further the relationship gets. So just because two people do not share any IBD segments, that does not mean they are not related. They may just not be DNA relatives.
 
If you find a cousin you don't share DNA with, there's not much you can do. That cousin won't be able to help you in your goal to map your own DNA to your ancestors. But if you have parents, siblings, aunts, uncles or cousins who have tested, they may share DNA with that cousin. If you plan to map any of their DNA to their ancestors, then that cousin may be useful.
 
The bottom line with double matching is that you'll want to use B people who match you and triangulate somewhere over your DNA with you.
 
 

The Double Match Triangulator program (also called DMT)

 
Two segment match files can contain a lot of information. Each file can have as many as 10,000 or even 100,000 or more segment matches with 1,000 or even 10,000 people. Putting this information together quickly and presenting it clearly so that it can be analyzed is not a trivial task. I experimented with various ways of displaying the results in Excel and came up with a segment map that I thought would be useful. Then, I created Double Match Triangulator to automate this.
 
DMT will find ALL the double matches between two segment match files. This could be 1,000 or even 10,000 double matches. Many of these will overlap with one of the Person A to Person B matches and therefore fully triangulate.
 
Since single matches are not included, a lot of the problematic segments that match by chance in single match triangulation are not even considered in double match triangulation.
 
The double matches that don't triangulate will have two of the three matches, AC and BC, and are referred to as Missing AB Matches. These are matches on both of Person C's chromosomes of a pair, so one match (either AC or BC) has to be from Person C's father, and the other from Person C's mother. It is possible that these missing AB matches are overlapping segments from both chromosomes of a chromosome pair from a common ancestor, recombined by a child of two of their descendants. Or more likely, one segment could be from an ancestral father and the other from the ancestral mother (see more discussion on this on my blog post: Triangulation and Missing a-b Segments). And of course, it could also be a by chance match just like a triangulation can be.
 
Overlapping double matches are grouped together into Triangulation Groups (TG). DMT denotes these on its Map page by drawing a thick box around them. The triangulation groups, if not formed by chance, could denote DNA segments that are IBD and come from common ancestors of Person A, B and C. The triangulation groups are taken from the point of view of Person A, so you'll see the AC matches tend to define their limits.
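 
Grouping overlapping segments can be pictured with the following Python sketch. It simply chains together segments on the same chromosome that overlap one another. This is an illustration only: the exact rule DMT uses to draw the boundaries of its triangulation groups is not given here, and the tuple layout and function name are assumptions.

```python
def group_overlapping(segments):
    """Group (chromosome, start, end, person) tuples into overlapping runs.

    A segment joins the current group when it is on the same chromosome
    and starts before the furthest end reached by the group so far.
    """
    groups = []
    current, current_chr, current_end = [], None, None
    ordered = sorted(segments, key=lambda s: (str(s[0]), s[1], s[2]))
    for seg in ordered:
        chrom, start, end = seg[0], seg[1], seg[2]
        if current and chrom == current_chr and start < current_end:
            current.append(seg)
            current_end = max(current_end, end)
        else:
            if current:
                groups.append(current)
            current, current_chr, current_end = [seg], chrom, end
    if current:
        groups.append(current)
    return groups
```

Two segments that merely touch end to start are kept in separate groups, since they share no region.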
 
 

Combining Results

 
Comparing one person with one other gives you a lot of information you can analyze. But if you compare one person to multiple people and combine the results, additional methods of analysis become possible. Double Match Triangulator allows you to match one person, Person A, with multiple B people and then combine the results. By doing so, you have the triangulations of A with B1, A with B2, A with B3, etc, all put together. This will allow you to determine who among the B1, B2, B3,... match each other on any segment and who don't match. When you have 2 or more B's matching each other, you are isolating these matches all to one parental chromosome, and the C people that they all match to will also be on the same chromosome.
 
This analysis can be used to assign matching segments to Person A's parents and help to map the segment back up to Person A's ancestors.
 
To make this data most useful for analysis, the matches are displayed to you by grouping every AC segment match together with all the B people who double match or triangulate on that segment.
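 
The grouping described above amounts to collecting, for each AC segment, every B person who double matches or triangulates on it. Here is a minimal Python sketch of that bookkeeping. The input layout and names are hypothetical and it is not meant to reflect how DMT stores its results.

```python
from collections import defaultdict

def combine_by_ac_segment(results):
    """Group the results of several A-with-B runs by AC segment.

    results: iterable of tuples
        (person_b, person_c, chromosome, ac_start, ac_end, kind)
    where kind is "double" or "triangulation".
    Returns {(person_c, chromosome, ac_start, ac_end): {person_b: kind}}.
    """
    combined = defaultdict(dict)
    for person_b, person_c, chrom, start, end, kind in results:
        combined[(person_c, chrom, start, end)][person_b] = kind
    return dict(combined)
```

Looking up one AC segment then shows at a glance which of B1, B2, B3, etc. are involved on that segment and which are not.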
 
 

What Follows

 
The details of the program and my thoughts on interpreting the DMT output are described on the pages that follow.