Difference between revisions of "Search"

From Freephile Wiki
(fixes link, adds Lucene info)
(fix image)
 
(13 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Template:Categorypage}}
+
{{Template:Categorypage|Search}}
 
== User Aids ==
 
== User Aids ==
 
=== Searching the Web ===
 
=== Searching the Web ===
Line 5: Line 5:
 
# http://www.googletutor.com/ Google Tutor helps you learn and understand Google
 
# http://www.googletutor.com/ Google Tutor helps you learn and understand Google
 
# http://www.googleguide.com/ Google Guide helps you learn and understand Google
 
# http://www.googleguide.com/ Google Guide helps you learn and understand Google
 +
 +
=== Semantic Web Search ===
 +
[[Image:Semantic Web Search Logo.png|thumb]]
 +
# http://swoogle.umbc.edu Semantic Web Search
  
 
=== Searching for Multimedia ===
 
=== Searching for Multimedia ===
When searching for unrestricted graphics content, it is hard to beat the huge commons of Wiki commons. Use the search engine on toolserver.org to find the images or other media you're looking for.  http://toolserver.org/~tangotango/mayflower/ Any image found there can be used under the terms of the (creative commons) license listed -- meaning it can be used here or on your website.
+
Want free graphics (as in Libre Graphics)?
 +
 
 +
When searching for unrestricted graphics content, it is hard to beat the huge commons of [https://commons.wikimedia.org/wiki/Main_Page WikiMedia Commons].  Any image found there can be used under the terms of the (creative commons) license listed -- meaning it can be used here or on your website.
 +
 
 +
Flickr also has millions of images licensed under creative commons licenses: https://www.flickr.com/creativecommons/
  
 
=== Native (Application) Search ===
 
=== Native (Application) Search ===
Applications such as this wiki (mediawiki), and CMS systems (e.g. Drupal) obviously know their own content.  So, if you are looking for something and want the best results for those applications, you should make use of the direct search facilities in the application.  Note that this wiki and the CMS systems also provide an 'OpenSearch' implementation that lets you use your browser's search toolbar to directly search these applications.
+
Applications such as this wiki (runs on MediaWiki), and CMS systems (e.g. Drupal) obviously know their own content.  So, usually it would suffice to make use of the search facilities built in to the application.  However, this doesn't always ring true -- especially when you consider that search as a service in it's own right is probably more powerful than search as a "feature" that is independently tacked on to each application in your stack.
  
# [[mw:Search]] helps you learn and understand the search capabilities of this system
+
There is a series of articles about the introduction of Full Text Search (FTS) in InnoDB engine for MySQL 5.6 at https://www.percona.com/blog/2013/02/26/myisam-vs-innodb-full-text-search-in-mysql-5-6-part-1/
# The Lucene backend used on Wikipedia [[mw:Extension:Lucene-search]] can be used for large-scale installations where the built-in search is not sufficient.  Note that the simplest enhancement you can make to a small-scale installation is to tweak the MySQL stopwords and word-length.
+
 
 +
Users and Implementors of MediaWiki, see [[MediaWiki/Search]].
  
 
== General ==
 
== General ==
 +
[[File:Apache Solr Front Cover.jpg|200px|right|reviewed by Greg Rundlett|link=https://books.google.com/books?id=9jByAgAAQBAJ&pg=PT24&source=gbs_toc_r&cad=3#v=onepage&q&f=false]]
 
Google offers a service called the [http://www.google.com/coop/cse/ Google Custom Search Engine].  The Google CSE is much like the 'normal' Google, but is configured to include only domains that you want.  Additionally, the domains can be grouped into 'realms' that can be used to assist the user to find content according to functional area.
 
Google offers a service called the [http://www.google.com/coop/cse/ Google Custom Search Engine].  The Google CSE is much like the 'normal' Google, but is configured to include only domains that you want.  Additionally, the domains can be grouped into 'realms' that can be used to assist the user to find content according to functional area.
  
Line 25: Line 35:
 
# The index will not allow custom data formats or indexes that you create... it's Google's algorithms for better or for worse.
 
# The index will not allow custom data formats or indexes that you create... it's Google's algorithms for better or for worse.
  
To meet these needs, use a product like [[mnoGoSearch]] [[mw:Apache_Solr]] or [[mw:Nutch]] which you are free to install and configure to suit your requirements.
+
To meet these needs, use a product like [[mnoGoSearch]], [[mw:Apache Solr|Apache Solr]] or [[mw:Nutch]] which you are free to install and configure to suit your requirements.
  
See [[wp:Category:Internet_search_engines]] for a list of search engine solutions.
+
See [[wp:Category:Internet search engines|Category:Internet search engines]] for a list of search engine solutions.
  
 
== Editors ==
 
== Editors ==
Line 34: Line 44:
  
 
== Developers ==
 
== Developers ==
 +
=== Search your code.  Can you 'grok' it? ===
 +
[[File:Opengrok-analysis.png|right]]
 +
LXR The [http://lxr.linux.no/ Linux Cross Reference]  is probably the first widely used web-based code cross-reference tool.  Along came [http://opengrok.github.io/OpenGrok/ OpenGrok] which started out as a project at Sun (which was bought by Oracle) and now the project lives on its own in the open.  OpenGrok is '''lightening fast''' and is actively maintained as an open source project on GitHub.  By the way, the underlying search is powered by SOLR.  Meanwhile, [http://kohsuke.org/ Kohsuke Kawaguchi] the magic man behind Jenkins (née Hudson), also wrote [http://sorcerer.jenkins-ci.org/ Sorceror] which understands semantics in Java.  Sadly, Sorceror code hasn't been touched in 4 years and doesn't seem to be an active project - but for Java codebases, it's probably still a good option.
 +
 +
=== Browser extensions / Web Apps ===
 
* [[wp:OpenSearch]] is a standard for building search plugins, and exposing search services.  It is how Firefox is able to hook up with wikipedia to offer suggestions '''as you type''' in the wikipedia search toolbar within Firefox.  It is also how this site tells the browser that it offers a search toolbar plugin.
 
* [[wp:OpenSearch]] is a standard for building search plugins, and exposing search services.  It is how Firefox is able to hook up with wikipedia to offer suggestions '''as you type''' in the wikipedia search toolbar within Firefox.  It is also how this site tells the browser that it offers a search toolbar plugin.
 
# Visit this site http://freephile.com/wiki/
 
# Visit this site http://freephile.com/wiki/
Line 50: Line 65:
 
That script (/wiki/opensearch_desc.php) generates xml output that the browser can interpret
 
That script (/wiki/opensearch_desc.php) generates xml output that the browser can interpret
  
On wikipedia, they use a slight improvement that offers the additional suggest as you type feature
+
== Search Engine Optimization (SEO) ==
 +
How do you make your website Search-engine friendly?  How do you get 'organic' traffic from Google?  It's all about [[SEO]].
  
 
[[Category:Help]]
 
[[Category:Help]]

Latest revision as of 09:08, 17 August 2018


User Aids

Searching the Web

  1. 'Google is your friend' (tm), and the 'Customize Google' tool (see Browser Extensions) does all kinds of things to help you search using Google. It also opens up the world of other search engines by letting you repeat your search on them with one click.
  2. http://www.googletutor.com/ Google Tutor helps you learn and understand Google
  3. http://www.googleguide.com/ Google Guide helps you learn and understand Google

Semantic Web Search

  1. http://swoogle.umbc.edu Semantic Web Search

Searching for Multimedia

Want free graphics (as in Libre Graphics)?

When searching for freely usable graphics, it is hard to beat the huge collection at Wikimedia Commons. Any image found there can be used under the terms of the (Creative Commons) license listed with it -- meaning it can be used here or on your website.

Flickr also has millions of images licensed under Creative Commons licenses: https://www.flickr.com/creativecommons/

Native (Application) Search

Applications such as this wiki (which runs on MediaWiki) and content management systems (e.g. Drupal) obviously know their own content. So it usually suffices to use the search facilities built into the application. However, this doesn't always hold true -- especially when you consider that search as a service in its own right is probably more powerful than search as a "feature" that is independently tacked on to each application in your stack.

There is a series of articles about the introduction of Full Text Search (FTS) in the InnoDB engine for MySQL 5.6 at https://www.percona.com/blog/2013/02/26/myisam-vs-innodb-full-text-search-in-mysql-5-6-part-1/

Users and Implementors of MediaWiki, see MediaWiki/Search.

General

Image: Apache Solr book front cover (reviewed by Greg Rundlett)

Google offers a service called the Google Custom Search Engine (CSE). The Google CSE is much like the 'normal' Google, but is configured to include only the domains that you want. Additionally, the domains can be grouped into 'realms' that help the user find content according to functional area.

An example implementation can be seen at the OASIS search site http://search.oasis-open.org

The main limitations of the Google CSE are that

  1. The content must be public (can't be used for your private Intranet)
  2. The index will not cover an unlimited amount of content
  3. The index will not allow custom data formats or indexes that you create... it's Google's algorithms for better or for worse.

To get around these limitations, use a product like mnoGoSearch, Apache Solr or Nutch, which you are free to install and configure to suit your requirements.

See Category:Internet search engines for a list of search engine solutions.

Editors

  1. The Template:Search template is meant to insert a 'search' link in wiki pages for searching this wiki. (Note: this template currently doesn't work.)
  2. The 'google' compact URI syntax lets you insert links in your wiki pages for queries on google.com, like this: google:bananas. See Namespaces for the full list of compact URI tools, including c2find, cache, dejanews, dictionary and rfc.
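
As a rough illustration of how a compact URI such as google:bananas can be expanded into a full link, here is a minimal Python sketch. The prefix table and its URL templates are assumptions made for this example (only two of the prefixes named above are shown); the wiki's real interwiki table may point elsewhere.

```python
from urllib.parse import quote_plus

# Hypothetical prefix table: these URL templates are assumptions for
# illustration, not the wiki's actual interwiki configuration.
PREFIXES = {
    "google": "https://www.google.com/search?q={query}",
    "rfc": "https://www.rfc-editor.org/rfc/rfc{query}",
}


def expand_compact_uri(compact):
    """Turn 'prefix:term' into a full URL, or return None if the prefix is unknown."""
    prefix, sep, term = compact.partition(":")
    template = PREFIXES.get(prefix.lower())
    if not sep or template is None:
        return None
    # quote_plus escapes spaces and other characters unsafe in a query string
    return template.format(query=quote_plus(term))


print(expand_compact_uri("google:bananas"))
```

Unknown prefixes return None here so that the calling code can leave the text unlinked rather than emit a broken link.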

Developers

Search your code. Can you 'grok' it?

LXR, the Linux Cross Reference, is probably the first widely used web-based code cross-reference tool. Along came OpenGrok, which started out as a project at Sun (later bought by Oracle) and now lives on its own in the open. OpenGrok is lightning fast and is actively maintained as an open source project on GitHub. By the way, the underlying search is powered by Apache Lucene. Meanwhile, Kohsuke Kawaguchi, the magic man behind Jenkins (née Hudson), also wrote Sorcerer, which understands semantics in Java. Sadly, the Sorcerer code hasn't been touched in 4 years and doesn't seem to be an active project, but for Java codebases it's probably still a good option.

Browser extensions / Web Apps

  • OpenSearch (wp:OpenSearch) is a standard for building search plugins and exposing search services. It is how Firefox hooks up with Wikipedia to offer suggestions as you type in the Wikipedia search toolbar. It is also how this site tells the browser that it offers a search toolbar plugin. To see it in action:
  1. Visit this site: http://freephile.com/wiki/
  2. Notice that the icon in the Firefox search toolbar turns bluish.
  3. Click the little icon to reveal that there is a plugin available that will 'Add Freephile Wiki (En)'.
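
Suggest-as-you-type relies on the site answering each partial query with a small JSON list. The following is a minimal sketch, assuming the OpenSearch Suggestions response shape (the query text followed by a list of completions) and a hypothetical in-memory list of page titles; a real wiki would consult its own title index instead.

```python
import json

# Hypothetical page titles, used only for this illustration.
TITLES = ["Search", "Search engine optimization", "Semantic Web", "Solr"]


def suggest(prefix, titles=TITLES, limit=10):
    """Return a suggestions-style list: [query, [completions]]."""
    wanted = prefix.lower()
    # Case-insensitive prefix match against the known titles
    matches = [t for t in titles if t.lower().startswith(wanted)]
    return [prefix, matches[:limit]]


# Serialize the way a suggestions endpoint would send it to the browser.
print(json.dumps(suggest("se")))
```

The browser calls such an endpoint on each keystroke, so the matching is kept to a cheap prefix test and the result list is capped with `limit`.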

OpenSearch Implementation example

OpenSearch is implemented in the MediaWiki system.

Viewing source on a page, you will see the following element:

<xml>
<link rel="search" type="application/opensearchdescription+xml" href="/wiki/opensearch_desc.php" title="{{SITENAME}} (English)" />
</xml>

That script (/wiki/opensearch_desc.php) generates XML output, an OpenSearch description document, that the browser can interpret.
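
To give a feel for what such a description document contains, here is a minimal Python sketch that builds one. It is not MediaWiki's implementation: the element names follow the OpenSearch 1.1 description format, while the function name, the description text and the search URL template are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_description(short_name, description, search_url_template):
    """Build a minimal OpenSearch description document as an XML string."""
    # Serialize the OpenSearch namespace as the default (prefix-less) namespace
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # {searchTerms} is the placeholder the browser replaces with the user's query
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "text/html", "template": search_url_template},
    )
    return ET.tostring(root, encoding="unicode")


# The search URL below is an assumed example, not this wiki's confirmed endpoint.
print(build_description(
    "Freephile Wiki (En)",
    "Search Freephile Wiki",
    "http://freephile.com/wiki/index.php?search={searchTerms}",
))
```

The browser fetches this document from the href in the link element shown above and uses the Url template to send queries to the site.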

Search Engine Optimization (SEO)

How do you make your website search-engine friendly? How do you get 'organic' traffic from Google? It's all about SEO.