
How to mathematically separate the wheat from the chaff


Guest Cryptic Megafauna


The problem with sightings databases is that they are an amalgam of reports at all levels of accuracy and reliability, so they may not give an accurate picture of Sasquatch location or habitat, and may even include misidentifications or hoaxes.

 

So here is a geospatial method for working toward a meaningful data set that might yield better analytic gold.

 

Let's start at the low end since that is always where you will need to begin.

 

To create a good sample, use QGIS and get the TIGER county data set for the USA and the TIGER states data set; the format should be shapefile (.shp).

I would suggest keeping them in a GIS folder in your home directory or at the root of your user folder.

 

Use the Add Vector Layer button in the GUI to load both, and save your project and data in a directory you will remember.

 

Now open the attribute table of the states layer and construct a SQL query to select the state of interest (in the spirit of keeping it simple, maybe start with one state like CA).

You may be able to find a counties file for just that state on TIGER, which would be even simpler. Save the selection as a new (vector) layer.

 

For a lot of this you may need to play around a little to learn it, or ask a question.

 

Now use Vector >> Geoprocessing Tools >> Clip and clip the counties by the state of interest (the new layer). The options are in the tool itself.

 

>> save the result as a new layer or add it to the map >> click Yes
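
If you would rather script the state selection than click through it, the attribute query amounts to a simple filter. Here is a minimal sketch in plain Python, assuming the county attribute table carries a STATEFP field holding the state FIPS code (check the field names in your own TIGER download). The rows and the helper name are made up for illustration.

```python
# Hypothetical rows shaped like a county attribute table.
# STATEFP is assumed to hold the state FIPS code ("06" is California).
counties = [
    {"STATEFP": "06", "NAME": "Humboldt"},
    {"STATEFP": "06", "NAME": "Del Norte"},
    {"STATEFP": "41", "NAME": "Curry"},
    {"STATEFP": "53", "NAME": "Skamania"},
]


def counties_in_state(rows, state_fips):
    """Keep only the county rows whose state FIPS code matches."""
    return [row for row in rows if row["STATEFP"] == state_fips]


ca_counties = counties_in_state(counties, "06")
print([row["NAME"] for row in ca_counties])
```

This only filters the attribute rows. The geometry clip itself still happens in QGIS.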

 

For the purpose of the exercise we will use the Patterson-Gimlin site, since its location is the most accurate we will ever get.

 

You need to get this data point yourself, so I won't go into it here, but you need a shapefile and a new vector layer for the point.

 

You could simply import the whole BFRO dataset, extract all the Class A sightings, save them as a layer, and mask it to the county and/or state of interest.

 

Add the results as a new layer, using the same clipping procedure for the area of interest (state and/or county).
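
For anyone curious what the extract-and-mask step does under the hood, here is a rough sketch in plain Python: keep the Class A rows, then test each point against the area outline with a ray-casting point-in-polygon check. The rectangle, the report rows, and the field names are all invented for illustration and are not the BFRO export format.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical area of interest: a plain rectangle, not a real county outline.
area = [(-124.5, 40.0), (-123.0, 40.0), (-123.0, 42.0), (-124.5, 42.0)]

# Hypothetical reports with assumed field names.
reports = [
    {"id": 1, "cls": "A", "lon": -123.7, "lat": 41.4},
    {"id": 2, "cls": "B", "lon": -123.8, "lat": 41.0},
    {"id": 3, "cls": "A", "lon": -120.0, "lat": 38.0},
]

class_a_in_area = [
    r for r in reports
    if r["cls"] == "A" and point_in_polygon(r["lon"], r["lat"], area)
]
print([r["id"] for r in class_a_in_area])
```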

 

Open the new layer and open the attribute table and find your point, highlight the row and save selected as a new layer. That is your P-G point.

 

The rest is untested so I can't give specific instructions but my process would be the same as yours, namely online research and development.

I'm sure the logical target exists.

 

Get a vegetative index or vegetative classification index or such. Likely you will need to find some satellite data, or data that has already been processed to classify the vegetation index.

Probably from a satellite data download or government site. Ideas are SPOT 7, NOAA (http://www.ospo.noaa.gov/Products/land/vegetation.html), USGS, or other government agencies or remote sensing sites that have data downloads. You can even process your own, but then you have another learning curve.

 

The next steps are untested, but you will probably be loading a raster data set as a new layer.

 

Clip the raster to your point of interest and county, state, or whatever.

 

Run a buffer for the point and then use the buffer to clip the raster index around the point. I would suggest that the smaller the buffer, the better the result, so say 0.1 miles.
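
To make the buffer idea concrete, here is a small sketch in plain Python, assuming the raster has already been reduced to a list of cell centers with a vegetation class each. The point, the cells, and the class names are invented, and the great-circle (haversine) distance is my choice of distance measure, not something QGIS requires.

```python
import math

EARTH_RADIUS_M = 6371000.0
METERS_PER_MILE = 1609.344


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classes_in_buffer(point, cells, radius_miles):
    """Vegetation classes of all cells whose center lies inside the buffer."""
    lat0, lon0 = point
    radius_m = radius_miles * METERS_PER_MILE
    return sorted({
        cls for lat, lon, cls in cells
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    })


# Hypothetical sighting point and raster cell centers (lat, lon, class).
sighting = (41.0, -123.0)
cells = [
    (41.0005, -123.0, "conifer"),
    (41.001, -123.0, "mixed"),
    (41.01, -123.0, "shrub"),      # roughly 1.1 km away, outside the buffer
    (41.0, -123.001, "conifer"),
]

print(classes_in_buffer(sighting, cells, 0.1))
```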

 

Now you have a known vegetative index for a known good sighting. Caution: the vegetation may have changed since the sighting.

 

The next step I would have to research, but it would be finding the vegetation index values that are present in your buffer raster extract; probably a list in the attribute table.

 

Now run an association for those vegetation types, run an association for that elevation ± 10 meters, say, and run an association in the BFRO or BFF data for time of year.

 

Do the masking process again and buffer your elevation, or get a range for an extract (you may need to play with the elevation dataset, which is too complex to cover here).

 

Clip your vegetation matching extract to your elevation buffer or SQL selection result.

 

Clip your BFRO Class A points by the result.

 

The points you now have may have a better likelihood of being accurate, and better still if separated by the right month.
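
The filtering in the last few steps can be sketched in plain Python as a match against a reference sighting on three things: vegetation class, elevation within a tolerance, and month. Every value and field name below is a placeholder, not real data for any site.

```python
# Hypothetical reference profile taken from a known good sighting.
reference = {"veg_classes": {"conifer", "mixed"}, "elevation_m": 780.0, "month": 10}

# Hypothetical Class A reports with assumed field names.
reports = [
    {"id": 1, "veg": "conifer", "elevation_m": 775.0, "month": 10},
    {"id": 2, "veg": "shrub", "elevation_m": 782.0, "month": 10},
    {"id": 3, "veg": "mixed", "elevation_m": 810.0, "month": 10},
    {"id": 4, "veg": "mixed", "elevation_m": 789.0, "month": 6},
]


def matches(report, ref, tol_m=10.0, check_month=True):
    """True if the report shares vegetation, elevation band, and (optionally) month."""
    same_veg = report["veg"] in ref["veg_classes"]
    same_elev = abs(report["elevation_m"] - ref["elevation_m"]) <= tol_m
    same_month = (report["month"] == ref["month"]) if check_month else True
    return same_veg and same_elev and same_month


strict = [r["id"] for r in reports if matches(r, reference)]
any_month = [r["id"] for r in reports if matches(r, reference, check_month=False)]
print(strict, any_month)
```

Dropping the month check shows how much each condition narrows the set, which is the point of running the associations separately.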

 

Now do the same for other months and other vegetation types and elevations, using other known good data, then mask the BFRO Class A points by each result, and it may reveal the better data.

 

Other useful results would be what plant life a Sasquatch depends on at what times of the year and at what elevation; what movement patterns, human population density, and activity levels drive the contact phenomenon; and what behavior and migration patterns are being represented. Even the sizes and locations of seasonal ranges.

 

Additional benefits are seasonal diet information and simply more accurate data for projections and analysis.

 

Hopefully this gives you some ideas on how to use geospatial data for BF research.

 

 


  • 3 weeks later...

Any given isolated report may be a hoax, or subject to any number of sources of error.  A basic rule of empirical science is that you never draw any firm conclusions based on a single experiment or observation.  Scientists strive for repetition and replication of experimental or observational results.  Large sighting databases are valuable to Bigfoot research because they provide data in necessary quantities for drawing conclusions.

 

The unique weakness of Bigfoot sighting databases, and "paranormal" data in general, is that we don't know to what degree the body of data has been tainted by hoaxing or systematic misperception.  This is where mathematical methods come into play.

 

The best mathematical analysis of a Bigfoot sighting database that I am aware of is the analysis published by Glickman in 1998.  The basic thrust of the method is that any person, in any place, at any time, is capable of submitting a hoaxed report.  For this reason, the rate of hoaxing should be proportional to the human population.  Similarly, if a Bigfoot sighting report is prompted by an encounter with an actual animal, then the more densely packed the people are in a given area, the more likely one or more of those people would be to encounter an animal residing within that area.  Consequently, the rate of sighting reports of an actual animal should be proportional to the human population density.

 

Using these two principles, Glickman analyzed the distribution of reports at the state level.  He used a hierarchical cluster analysis (a method used to determine the similarity of individual pieces of data within a dataset) to divide the report database into two groups.  Group A covered Alaska, Washington, Oregon, Northern California, Idaho, and Montana.  Group B covered everything else.
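
For readers who have not met hierarchical clustering before, here is a toy sketch in plain Python. It is not Glickman's procedure: which features and linkage rule he used are not given in this post, so the single-linkage rule and the two made-up features per region are assumptions chosen only to show the mechanics.

```python
import math


def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering on a dict of name -> feature tuple."""
    clusters = [[name] for name in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[x], points[y]) for x in a for y in b)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the two closest clusters
        del clusters[j]
    return clusters


# Hypothetical regions with two invented, already-scaled features each.
regions = {
    "R1": (0.0, 0.0),
    "R2": (0.1, 0.0),
    "R3": (5.0, 5.0),
    "R4": (5.1, 5.0),
}

groups = agglomerate(regions, 2)
print(groups)
```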

 

He found that in Group A, the number of reports per state showed a correlation with both the population and population density.  The conclusion is that reports in these states are a result of a combination of hoaxing and actual animal sightings.  On the other hand, Group B showed only a correlation with population, the conclusion being that reports in the rest of the country are solely due to hoaxing.
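
As a rough illustration of the two tests, here is a sketch in plain Python that correlates report counts with population and with population density. The Pearson coefficient is an assumption on my part for the example, since the statistic is not named here, and the numbers are invented rather than taken from any database.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-state figures, one entry per state.
population_millions = [1.0, 2.0, 3.0, 4.0]
density_per_sq_mi = [10.0, 5.0, 20.0, 1.0]
report_counts = [2, 4, 6, 8]

r_population = pearson(population_millions, report_counts)
r_density = pearson(density_per_sq_mi, report_counts)
print(round(r_population, 3), round(r_density, 3))
```

In this made-up set the reports track population exactly and density hardly at all, which is the pattern the method reads as hoaxing only.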

 

In 2005 and 2006 I extended Glickman's analysis to check for correlations between the number of Bigfoot reports per state and the black and brown bear population densities in those states.  There was no correlation with black bears in either Group A or B, and actually a negative correlation with brown bears in Group A.  My conclusion was that we could eliminate the only two animals that could reasonably be mistaken for Bigfoot, and speculate that Bigfoot sightings really are due to an uncataloged animal species.  Moreover, I concluded based on the brown bear data that Bigfoot and grizzly bears are natural enemies.

 

Recently I've been updating all of these analyses based on the BFRO database instead of the outdated Green data that both Glickman and I had previously used.  The BFRO database also allowed me to use Canada in the analysis.  I should be finished with the analysis and write up shortly.


Guest Cryptic Megafauna
Great post.

I look forward to reading your report.

I would add that the false reporting in other parts of the country may not be hoaxing as much as fantasy, wishful thinking, mixed in with misidentification.

The common driver would be social media, the internet, and the increasing popularization of the subject, with the subsequent lowering of the bar for reporting driven by aficionados, storytellers, and special interest groups, including for-profit ventures such as paid summits with celebrity lecturers, books, films, etc.

Edited by Cryptic Megafauna

Admin

^^^  That is why you need to vet the data first. Eliminate obvious hoaxes, misids, implausible reports, etc.

 

You need to sanitize the data first, like what we do with the SSR...

 


  • 4 weeks later...

Short summary of my conclusions:

 

Yukon Territory, Alaska, British Columbia, Northern California, Montana, Washington, Oregon, Manitoba, and Wyoming show multiple effects contributing to the Bigfoot phenomenon in these states.  Hoaxing is present to a much higher degree here than in the rest of North America.  Misidentification of black bears is another major contributor in these states.  These two tendencies may be due to geographically-biased cultural effects (i.e., a general belief that this region is the "Bigfoot territory").  The remainder of the reports may be due to an uncataloged species.

 

Idaho, Alberta, Colorado, New Mexico, Texas, Utah, Ontario, Oklahoma, Arkansas, South Dakota, Saskatchewan, Missouri, and Arizona are very interesting.  Hoaxing here is only present at the same rate as in the remainder of North America.  Misidentification of black bears does not contribute significantly to Bigfoot sighting reports in these states.  Nearly all non-hoaxed reports here are likely due to an uncataloged species.

 

The remainder of North America shows only contributions from hoaxing at this level of analysis.  No correlations evidencing a contribution from sightings of animals, cataloged or uncataloged, were found in the data.

 

One recommendation that comes from this research is that we should actually be devoting most of our efforts, insofar as report collection and analysis of specific details in reports are concerned, to the second set of states, outside of the "core" region consisting of the first set of states.  This is due to a lower rate of hoaxing and absence of black bear misidentification.

 

I hope to go deeper into this analysis in the coming weeks.


4 hours ago, Mendoza said:

The remainder of North America shows only contributions from hoaxing at this level of analysis.  No correlations evidencing a contribution from sightings of animals, cataloged or uncatalogued, were found in the data.

 

Here's where you lost me. You're basically saying that all reports in Minnesota and Iowa are hoaxes because they fall into group B. I disagree. (I don't care as much but the logic also applies to Michigan, Illinois, Ohio, and Wisconsin)

While the methods you used might be sound, the data really needs to be adjusted. Here are a few reasons I think the conclusions are off.

 

1. There are many encounters in Iowa that remain out of the public BFRO data base because they'd give up expedition locations. You'll see some of them when Finding Bigfoot Iowa gets aired. I can think of 7 or 8 encounters that happened while I was present, that would all be good enough to make it into the database (knocks, whoops with audio, and class B sightings which as shown in your report cannot really be attributed to black bear).  In Minnesota, the BFRO presence has dropped off to form other groups that are taking reports and holding expeditions (one of which I am a part of). Reports that they receive don't make it into the BFRO database. When Finding Bigfoot rolled into Minnesota they looked to at least one of these other groups for potential locations and witnesses.

 

2. If your conclusions are somehow based on population, they need some tweaking. The residents of Des Moines and Minneapolis/St Paul make up a large population percentage of both states, but the sightings are (generally) far away from those big cities. If you want to conclude that Des Moines and Minneapolis suburban sightings are hoaxes I can live with that. Take away those big cities and your population density for the rest of the state falls off greatly, I'd suspect, and most likely changes the conclusions. (same argument for Illinois, Ohio, Michigan, and Wisconsin)

 

3. I think you somehow need to account for the percentage of forest each state has. Iowa has more forest than you'd suspect, but I'd guess it's a pretty low percentage of total area compared to some other states. I'd like to know how the sightings per square mile of forest works out for all the states. (I may look into that actually) That might work in Iowa's favor but hurt Minnesota.

 

4. South Dakota as an A' while Iowa, Minnesota, Ohio, Illinois, and Wisconsin are all 'probable hoaxes'  is a huge red flag. Sioux Falls must not be large enough.

 

There's probably more and I don't have the know-how to figure out exactly how this all fits in. I just know the initial conclusions don't fit my view of reality.

 

Edit: I left off rain totals, which I'm pretty certain matter greatly...
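
The forest normalization in point 3 is simple arithmetic. Here is a quick sketch in plain Python with placeholder states and invented numbers, not real report counts or forest acreage.

```python
def reports_per_1000_sq_mi_forest(report_count, forest_sq_mi):
    """Report count normalized by forested area instead of total area."""
    return 1000 * report_count / forest_sq_mi


# Hypothetical (report count, forested square miles) per state.
states = {
    "State X": (60, 4_000),    # few reports, little forest
    "State Y": (90, 27_000),   # more reports, far more forest
}

rates = {
    name: reports_per_1000_sq_mi_forest(count, forest)
    for name, (count, forest) in states.items()
}
print(rates)
```

With these invented figures the state with fewer raw reports comes out ahead once forest area is the denominator, which is the kind of reversal the normalization is meant to expose.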

Edited by Redbone

Guest Cryptic Megafauna

The only forests deep in cover, steep enough, remote enough, enough rainfall, enough food, to contain a large captured bipedal great ape,

that also has extensive oral history back to prehistoric times is....

 

My backyard in the suburbs, really!

Or maybe my attic, where the bats and the antelopes play.

I'm so confused.

 

In reality it is the Pacific mountainous coast rain-forests that stretch up to Alaska.

 

Perhaps Tibet, Siberia, Indonesia.

 

Interestingly, these are also places where Homo erectus is known to have spread.

 

 

 

 


On 12/26/2016 at 3:35 PM, Mendoza said:

Attached is the promised paper detailing the updated statistical analysis I mentioned in my previous post in this thread.

bigfootupdatedanalysispaper.pdf

 

Interesting study. But I am with Redbone on the conclusions from the data used. 

I will use the state of Washington in your core area for example. A human population density of +101 /sq mi gives a very skewed view of the state as a whole. Half the population lives in only three counties of the 39 in the state. If you go down to the county level using Skamania county for an example (one of the highest sighting areas in the state) the population density is <7. That is a huge difference from the stated value for the state. That figure is closer to the reality of what is seen in the remote areas of the state even during the summer when outdoor recreation is at a high. The huge majority of people in the higher population areas will never leave those areas seasonally to increase the population densities in the more remote places. The same holds true in Oregon as well. Some of them might go to the beach during the summer or to a ski resort in the winter but an occasional camping trip or a drive down one of the interstates is the most influx you will see. Main arterial usage numbers would probably be a better data set to use, as Explorer has done with his studies. 

 

The bear population density is skewed in the other direction, because the state of WA is not all bear habitat; only about half of it is. That effectively doubles the bear population density in the habitat that is actually available. Again, Oregon is about the same.

 

As Redbone mentioned there is a big difference between the urban and rural population. And available habitat should be considered. This is going to change the analysis and the conclusions. 
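
The density point is easy to see with a little arithmetic. Here is a sketch in plain Python using round, invented numbers for a state where half the people live in a few urban counties. These are not census figures for Washington or any other state.

```python
def density(population, area_sq_mi):
    """People per square mile."""
    return population / area_sq_mi


# Hypothetical state totals and the share held by a few urban counties.
state_pop, state_area = 7_000_000, 66_000
urban_pop, urban_area = 3_500_000, 6_000

statewide = density(state_pop, state_area)
rest_of_state = density(state_pop - urban_pop, state_area - urban_area)

print(round(statewide, 1), round(rest_of_state, 1))
```

Pulling out just the urban counties cuts the figure roughly in half here, and a single remote county would fall much lower still, so the statewide number says little about the places where reports actually come from.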


Mendoza,

Thanks for your efforts and applying statistical analysis to Green’s database.

Below are some general observations/comments.

I don’t quite follow the premises assumed for your conclusions:

1)      That correlation to population is suggestive of fabricated reports.

2)      That correlation to population density is consistent with the model of receiving a report of an animal.

3)      That correlation between black bear population density and report frequency is the expected result if misidentification of black bears is a significant contributor to the Bigfoot phenomenon.

 

On premise #1, I would imagine that states with higher human population will yield higher number of hoaxes (there is always a % of those).  But, I would also imagine that if BF was a real creature, that states (that contain BF habitat) with higher human populations will also yield higher number of reports.  Thus, I would expect some correlation between frequency of BF reports and human population, and this would not necessarily imply hoax.

On premise #2, I also agree with BigTreeWalker, in that human population density should be more granular (maybe by county instead of state) in order to reflect the true low population density of those places in Northern CA, OR, and WA that have higher BF frequency counts.   I am curious whether you would find the opposite result (once adjusting for human population by county), namely a negative correlation between BF report frequency and human population density.  That is what I would expect.  For example, I would expect that Del Norte County and Tuolumne County in California will have more BF reports than West Contra Costa County, despite WCC County having higher human population density.

On premise #3, I always thought that BF sighting reports would be positively correlated with black bear population density because they share the same habitat.  My rule of thumb has been: if there is bear and deer population present, then there is a higher probability of finding BF present.  Thus, a positive correlation between frequency of BF sighting reports and black bear density does not suggest or imply misidentification of black bears as BF.

I would love to see your work applied to PNW states by county and see what you find.  Instead of differentiating (testing the different hypotheses) by states, you would differentiate by counties.  We can still learn much from this effort.  Also, you might be able to check if any BF reports are present in counties that have zero black bear populations (if they exist, I have not checked).  In my mind, if there is no bear habitat present, then the report is more likely to be a hoax. 


19 hours ago, Redbone said:

 

Here's where you lost me. You're basically saying that all reports in Minnesota and Iowa are hoaxes because they fall into group B. I disagree. (I don't care as much but the logic also applies to Michigan, Illinois, Ohio, and Wisconsin)

While the methods you used might be sound, the data really needs to be adjusted. Here are a few reasons I think the conclusions are off.

 

1. There are many encounters in Iowa that remain out of the public BFRO data base because they'd give up expedition locations. You'll see some of them when Finding Bigfoot Iowa gets aired. I can think of 7 or 8 encounters that happened while I was present, that would all be good enough to make it into the database (knocks, whoops with audio, and class B sightings which as shown in your report cannot really be attributed to black bear).  In Minnesota, the BFRO presence has dropped off to form other groups that are taking reports and holding expeditions (one of which I am a part of). Reports that they receive don't make it into the BFRO database. When Finding Bigfoot rolled into Minnesota they looked to at least one of these other groups for potential locations and witnesses.

 

2. If your conclusions are somehow based on population, they need some tweaking. The residents of Des Moines and Minneapolis/St Paul make up a large population percentage of both states, but the sightings are (generally) far away from those big cities. If you want to conclude that Des Moines and Minneapolis suburban sightings are hoaxes I can live with that. Take away those big cities and your population density for the rest of the state falls off greatly, I'd suspect, and most likely changes the conclusions. (same argument for Illinois, Ohio, Michigan, and Wisconsin)

 

3. I think you somehow need to account for the percentage of forest each state has. Iowa has more forest than you'd suspect, but I'd guess it's a pretty low percentage of total area compared to some other states. I'd like to know how the sightings per square mile of forest works out for all the states. (I may look into that actually) That might work in Iowa's favor but hurt Minnesota.

 

4. South Dakota as an A' while Iowa, Minnesota, Ohio, Illinois, and Wisconsin are all 'probable hoaxes'  is a huge red flag. Sioux Falls must not be large enough.

 

There's probably more and I don't have the know-how to figure out exactly how this all fits in. I just know the initial conclusions don't fit my view of reality.

 

Edit: I left off rain totals, which I'm pretty certain matter greatly...

 

This is an excellent critique that sheds light on some of the limitations of my analysis.  I would have replied earlier, but I'm still under the one post per day newbie probation.

 

I should first say, however, that you've paraphrased my conclusions to the effect that I've concluded that the reports from Group B are all hoaxes.  This isn't exactly what I said.  I tried to choose my words carefully (though perhaps not carefully enough), but overall it was not my intent to claim anything so definitive.  You'll see that my conclusions have been tempered with words like "may" and "no evidence."  As you know, absence of evidence is not evidence of absence.  I've merely found no statistical evidence in this particular analysis for non-hoaxed reports in Group B; this doesn't mean that all reports here are hoaxes, only that my analysis has not been able to identify a set of non-hoaxed reports from these states by their telltale mathematical relationships.  Some of the reasons for this are those you've pointed out already, which I'll now address one at a time.

 

1.  It sounds like you're describing relatively recent developments in Iowa and Minnesota, whereas the BFRO database is over a decade old.  Green's database, used by Glickman in his analysis which largely agrees with mine, goes back even further.  I would expect the impact of what sounds like recent developments in report publication protocol and submission channels to be minimal.  Still, I agree that it's important to be aware of localized biases in the data that could skew an analysis, so your point is appreciated.

 

2.  This point is spot-on.  Since all of the calculations were done at the state/province level, this can be considered the effective "resolution" of the present analysis.  Urban population concentrations and localized Bigfoot reporting "hot spots" are smeared out over entire states.  What I'm offering, then, is a blurry picture:  some of Bigfoot's range is going to be smudged over a wider area than probably exists in reality, while other parts of the range may be smudged out entirely.  Still, we have a picture, blurry as it is.  A higher-resolution analysis will produce a clearer picture, and something may well appear in the states whose omission from Bigfoot's range you've objected to.

 

A good analogy is the Dawn spacecraft's approach to the dwarf planet Ceres.  From Earth, our best telescopes indicated a bright spot, but not much else.  This is analogous to the state of Bigfoot research in years past: we knew there was something there, and it seemed to be centered on the Pacific Northwest, but that was about it.  As Dawn got closer, we saw the spot resolved into two smaller, very bright spots in a crater.  Hints of a few other bright spots also appeared.  This is right about where I would say my analysis puts us, with the emergence of the newly identified Group A'.  Finally, Dawn arrived in orbit around Ceres and determined that one of the two bright spots was a single circular feature, while the second was actually the sum of several spots of varying sizes.  At the same time, numerous similar bright spots are now known to pepper Ceres.  This is where I hope our knowledge of Bigfoot's range will be in a few years.  Then, we can start figuring out what the bright spots are actually made of--figuring out what this Bigfoot thing really is.

 

3.  This point assumes that forest is the preferred habitat of Bigfoot.  While this is probably a reasonable guess, I'm not willing to simply admit it into the analysis as an assumption.  I've tried to keep a priori assumptions about the nature of Bigfoot to a minimum.  You know about "garbage in, garbage out."  If there's something wrong with the assumption (for example, maybe Bigfoot can handle both forests and swamps, or maybe Bigfoot only does coniferous forests), we'll just get our wrong assumption back in the form of a flawed conclusion.

 

That said, an analysis such as the one you propose might give us some information about whether or not forest actually is Bigfoot's primary habitat.  It would be worth looking into.

 

4.  South Dakota coming out in Group A' was a surprise to me as well, but I'm just reporting what the math tells me.  Is anyone here familiar with the natural environment of South Dakota?

 

Hopefully this reply has mostly covered BigTreeWalker's points as well.  Explorer raised a different set of issues which I'll try to address about 24 hours from now.  (This forum must have gotten some serious trolls in my absence for them to implement such a strict probationary period.)


I'm attaching a second version of my paper, with a major addition.  There is a new section showing an analysis of Bigfoot sighting data and brown bear population data, and a new paragraph and some other additions in the conclusion.  Some typographical errors have also been corrected.

 

I hadn't attempted an updated analysis with brown bear data before because, given the difficulty I had just finding data for the United States back in 2006, I figured the Canadian data I would need for this updated analysis wouldn't be available.  Little did I know that Canada has done a better job of coordinating efforts to track brown bear population data than their American counterparts.  That data has now been incorporated into my paper.

bigfootupdatedanalysispaper-v2.pdf


In the lower 48, unless you're in Idaho, Montana or Wyoming (especially Yellowstone) your chances of seeing a grizzly (brown bear) are probably lower than seeing an actual sasquatch elsewhere, according to sighting reports. ;)


 
Idaho, Alberta, Colorado, New Mexico, Texas, Utah, Ontario, Oklahoma, Arkansas, South Dakota, Saskatchewan, Missouri, and Arizona are very interesting.  Hoaxing here is only present at the same rate as in the remainder of North America.  Misidentification of black bears does not contribute significantly to Bigfoot sighting reports in these states.  Nearly all non-hoaxed reports here are likely due to an uncataloged species.
---------------------------------
 
Why does Mis ID of black bears not contribute greatly in Idaho or Colorado vs. Washington or Montana?
 
That does not make any sense.