C Ramu sirw

News! GO-terms

GO terms are now searchable in the swissprot-release database. GO categories are now recognized. Note that GO terms are parsed as words.

Introduction

SIRW is a web interface to SIR (Simple Indexing and Retrieval System). It combines the ability to search protein/nucleotide databases with keywords and with a sequence motif. The SIRW framework allows sequence analysis modules to be plugged in directly: any Python module that does something with the sequence is appropriate to include.

SIRW supports the following databases

Keyword searching

SIRW lists all the entry fields for the database you selected. A keyword can be typed in or chosen through a selection box or checkbox. The method names are listed at the end of the input form. Right now the only method is pattern searching (motif detection). This method becomes part of the database search field. The search algorithm works as follows:

   for all the entries found through keywords
       display the selected fields
       apply method if selected
         if method has result
           display result
   end
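
The loop above can be sketched in Python roughly as follows. This is only an illustration and not SIRW's actual code: the entry dictionaries, the `run_search` function, and the `pattern_method` helper are assumptions made for this example.

```python
import re

def pattern_method(pattern):
    """Build a hypothetical 'method' that reports motif hits in a sequence."""
    regex = re.compile(pattern)

    def method(sequence):
        # 1-based, inclusive (start, end) positions plus the matched text
        return [(m.start() + 1, m.end(), m.group()) for m in regex.finditer(sequence)]

    return method

def run_search(entries, fields, method=None):
    """Follow the loop above for entries already found through keywords."""
    report = []
    for entry in entries:
        # display the selected fields
        shown = {name: entry[name] for name in fields}
        result = None
        # apply method if selected, and keep the result only if there is one
        if method is not None:
            hits = method(entry["seq"])
            if hits:
                result = hits
        report.append((shown, result))
    return report

# Made-up entries, standing in for the keyword search result.
entries = [
    {"id": "demo_1", "des": "hypothetical receptor", "seq": "AAKKLPPLLSE"},
    {"id": "demo_2", "des": "hypothetical protein", "seq": "AAAAAAAA"},
]

for shown, result in run_search(entries, ["id", "des"], pattern_method("L..LL")):
    print(shown, result)
```

In this sketch an entry without a motif hit is still listed, with `None` as its method result, which follows the loop as written above.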

You can use the wildcard '*' (e.g. finc_*) in a keyword, and you can combine keywords, as in 'dna & binding'. The cross-references for an entry are hypertext links.
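
As a rough illustration of how a wildcard and '&' could behave, the sketch below matches keys with Python's `fnmatch` module and intersects the resulting sets. The toy index and this two-argument `find` helper are assumptions for the example. Only the general query form `find(...) & find(...)` is taken from the SIRW query shown later on this page.

```python
from fnmatch import fnmatchcase

# Toy index: field -> key -> set of entry ids (made-up data).
INDEX = {
    "id": {
        "finc_human": {"finc_human"},
        "finc_rat": {"finc_rat"},
        "nco2_rat": {"nco2_rat"},
    },
    "des": {
        "dna": {"e1", "e2"},
        "binding": {"e2", "e3"},
        "receptor": {"e3"},
    },
}

def find(field, keyword):
    """Return the ids of entries whose key in `field` matches `keyword`.

    '*' matches any run of characters. Note that fnmatch also treats
    '?' and '[...]' as wildcards, which SIRW may or may not do.
    """
    hits = set()
    for key, ids in INDEX[field].items():
        if fnmatchcase(key, keyword.lower()):
            hits |= ids
    return hits

print(sorted(find("id", "finc_*")))
# '&' on sets keeps only the entries found by both keywords
print(sorted(find("des", "dna") & find("des", "binding")))
```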

The pattern method has an 'alignformat' display style, which shows the sequence motifs it finds as though they were aligned. This output is useful as input for multiple sequence alignment programs to refine a motif further.

There will be more methods in the future, and I welcome your suggestions.

Scanning through the indices

When you click on a database name, you get the index browsing page. The following is an example of the index browsing page for the prosite database.

    
FieldName  Long Name       Key Length  Field Type  Indexed  Index Date
id         Identification  1564        primary     Yes      Tue Jan 14 10:43:02 2003
acc        Accession       1566        string      Yes      Tue Jan 14 10:43:04 2003
type       Type            3           string      Yes      Tue Jan 14 10:43:04 2003
taxo       Taxo            22          string      Yes      Tue Jan 14 10:43:04 2003
pmod       pmod            1           string      Yes      Tue Jan 14 10:43:04 2003
ftkey      ftkey           5           string      Yes      Tue Jan 14 10:43:04 2003
3d         3d              0           string      No       No not indexed
doc        doc             0           string      No       No not indexed

The highlighted fields lead to a page where you can type in your keywords.

FieldName  Long Name       Key Length  Field Type  Indexed  Index Date
id         Identification  1564        primary     Yes      Tue Jan 14 10:43:02 2003

Browse with key ____ occurring at least ____ times
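
Browsing an index for keys that occur at least a given number of times can be pictured with the small sketch below. The `browse` function and the key list are made up for illustration and say nothing about how SIR stores its indices.

```python
from collections import Counter

def browse(keys, prefix="", min_count=1):
    """List (key, occurrences) pairs for keys starting with `prefix`
    that occur at least `min_count` times, sorted by key."""
    counts = Counter(keys)
    return sorted(
        (key, n)
        for key, n in counts.items()
        if key.startswith(prefix) and n >= min_count
    )

# Made-up keys, one per indexed entry, for a single field.
keys = ["pattern", "matrix", "pattern", "rule", "pattern", "matrix"]

print(browse(keys, min_count=2))
print(browse(keys, prefix="r"))
```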

Motif pattern making and searching

To search the entries selected with keywords for a pattern or subsequence, you need to specify the pattern. Motif patterns are described with regular expressions, and SIRW requires regular expressions that follow the POSIX standard. For a detailed description of how to make a motif pattern, refer to

For making a pattern from an aligned set of sequences go to

With SIRW, the pattern field becomes an extended field of the selected database entry. The following is the truncated result of a query with the keywords 'nuclear & receptor' in the swissprot description field. The result is then searched for the nuclear receptor box motif "LXXLL", which is written "L..LL" as a POSIX regular expression.

    
Query = 'find('swissprot','des',"nuclear") & find('swissprot','des',"receptor")'

1: nrh3_mouse/433-445   VFALRLQDKK   LPPLL   SEIWDVHE
1: nco2_rat/641-655   RLHESKGQTK   LLQLL   TTKSDQMEPS
2: nco2_rat/690-704   GTSLKEKHKI   LHRLL   QDSSSPVDLA
3: nco2_rat/745-759   ASPKKKENAL   LRYLL   DKDDTKDIGL
4: nco2_rat/878-892   STFNNPRPGQ   LGRLL   PNQNLPLDIT

Without alignformat, the output looks like this (here the description field is selected for display):

Query = 'find('swissprot','des',"nuclear") & find('swissprot','des',"receptor")'

swissprot:nrh3_mouse

DE Oxysterols receptor LXR-alpha (Liver X receptor alpha) (Nuclear orphan
DE receptor LXR-alpha).
ID nrh3_mouse STANDARD; PRT; 445 AA.
SQ SEQUENCE 445 AA;
MSLWLEASMP DVSPDSATEL WKTEPQDAGD QGGNTCILRE EARMPQSTGV ALGIGLESAE PTALLPRAET LPEPTELRPQ KRKKGPAPKM LGNELCSVCG DKASGFHYNV LSCEGCKGFF RRSVIKGARY VCHSGGHCPM DTYMRRKCQE CRLRKCRQAG MREECVLSEE QIRLKKLKRQ EEEQAQATSV SPRVSSPPQV LPQLSPEQLG MIEKLVAAQQ QCNRRSFSDR LRVTPWPIAP DPQSREARQQ RFAHFTELAI VSVQEIVDFA KQLPGFLQLS REDQIALLKT SAIEVMLLET SRRYNPGSES ITFLKDFSYN REDFAKAGLQ VEFINPIFEF SRAMNELQLN DAEFALLIAI SIFSADRPNV QDQLQVERLQ HTYVEALHAY VSINHPHDRL MFPRMLMKLV SLRTLSSVHS EQVFALRLQD KKLPPLLSEI WDVHE
//
Pattern (?:L..LL) Overlap : None
1: (433 , 437) VFALRLQDKK LPPLL SEIWDVHE

swissprot:nr42_rat

DE Orphan nuclear receptor NURR1 (NUR-related factor 1) (Regenerating
DE liver nuclear receptor 1) (RNR-1) (SL-322) (Nuclear orphan receptor
DE HZF-3).
ID nr42_rat STANDARD; PRT; 598 AA.
SQ SEQUENCE 598 AA;
MPCVQAQYGS SPQGASPASQ SYSYHSSGEY SSDFLTPEFV KFSMDLTNTE ITATTSLPSF STFMDNYSTG YDVKPPCLYQ MPLSGQQSSI KVEDIQMHNY QQHSHLPPQS EEMMPHSGSV YYKPSSPPTP STPGFQVQHS PMWDDPGSLH NFHQNYVATT HMIEQRKTPV SRLSLFSFKQ SRPGTPVSSC QMRFDGPLHV PMNPEPAGSH HVVDGQTFAV PNPIRKPASM GFPGLQIGHA SQLLDTQVPS PPSRGSPSNE GLCAVCGDNA ACQHYGVRTC EGCKGFFKRT VQKNAKYVCL ANKNCPVDKR RRNRCQYCRF QKCLAVGMVK EVVRTDSLKG RRGRLPSKPK SPQDPSPPSP PVSLISALVR AHVDSNPAMT SLDYSRFQAN PDYQMSGDDT QHIQQFYDLL TGSMEIIRGW AEKIPGFADL PKADQDLLFE SAFLELFVLR LAYRSNPVEG KLIFCNGVVL HRLQCVRGFG EWIDSIVEFS SNLQNMNIDI SAFSCIAALA MVTERHGLKE PKRVEELQNK IVNCLKDHVT FNNGGLNRPN YLSKLLGKLP ELRTLCTQGL QRIFYLKLED LVPPPAIIDK LFLDTLPF
//
Pattern (?:L..LL) Overlap : None
1: (552 , 556) NNGGLNRPNY LSKLL GKLPELRTLC
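
The motif search and the alignformat lines shown above can be approximated with Python's `re` module, as in the minimal sketch below. The functions `motif_to_regex` and `align_lines` are hypothetical names and not part of SIRW. The 10-residue flanks and the range printed after the entry name (motif start to the end of the right flank) are inferred from the sample output and may not match SIRW in every case.

```python
import re

def motif_to_regex(motif):
    """Rewrite a motif that uses X for 'any residue' (LXXLL) as a regex (L..LL)."""
    return motif.upper().replace("X", ".")

def align_lines(name, sequence, pattern, flank=10, first_residue=1):
    """Format each non-overlapping motif hit as: left flank, motif, right flank.

    `first_residue` is the 1-based position of sequence[0] in the full
    protein, so that a fragment can be searched with correct numbering.
    """
    lines = []
    for number, m in enumerate(re.finditer(pattern, sequence), start=1):
        left = sequence[max(0, m.start() - flank):m.start()]
        right = sequence[m.end():m.end() + flank]
        start = first_residue + m.start()
        # Assumption: the printed range ends with the right flank.
        end = first_residue + m.end() - 1 + len(right)
        lines.append(f"{number}: {name}/{start}-{end}   {left}   {m.group()}   {right}")
    return lines

# Last 25 residues of nrh3_mouse (residues 421-445), copied from the entry above.
tail = "EQVFALRLQDKKLPPLLSEIWDVHE"

for line in align_lines("nrh3_mouse", tail, motif_to_regex("LXXLL"), first_residue=421):
    print(line)
```

Because every motif hit sits in the same column, lines like these can be passed on to a multiple sequence alignment program, as described under keyword searching.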
© Copyright   Chenna Ramu, EMBL, Heidelberg, Germany


Contact: Chenna Ramu, EMBL, Heidelberg, Germany.