Taoffi's blog

prisonniers du temps

doc5ync – word index web page presentation!

Objectives:

  • Display a cloud of index words, each dimensioned relative to its occurrences in e-book information (e-book title, description, author, editor… etc.)
  • On selection of a given word: display the list of e-book references related to the selected word
  • On selection of an e-book: display its information details and all words linked to it

Context:

doc5ync web interface is based on a meta-model engine (simpleSite, currently being renamed to web5ync!).

I talked about meta-models in a past post, and about some of their potential applications in other posts.

The basic concept of meta-models is to describe an object by its set of properties and enable the user to act on these properties by modifying their values in the meta-model database. At runtime, those property values are assigned to each defined object.

In our case, for instance, we have a meta-model describing web page elements, and a meta-model describing the dataset of word index and their related e-books.

For web page elements, the approach considers a web page as a set of html tags (i.e. <div>, <table><tr><td>…, <img… etc.), where each tag has a set of properties (style and other attributes) for which you can define the desired values. At runtime, your meta-model-defined web page comes to life by loading its html tags, assigning the defined values to each, and injecting the output of the process into the web response.

A dataset is similarly considered as a set of rows (obtained through a data source), each composed of data cells containing values. Data cells can then be either presented and manipulated through web elements (html tags, above) or otherwise manipulated through web services.

Data storage and relationships

As we mentioned in the previous post, index words and their related e-books are stored in database tables as illustrated in the following figure:

Each word of the index provides us with its number of occurrences in e-book text sequences (known on Trie scan).

Html formatting using a SQL view

To reflect this information in the presentation, we used a view that formats an html div element for each word, relative to its number of occurrences. The query looks like the following code:

select
    w.id            as word_id
  , w.n_occurs
  , N'<div style="BASIC STYLE STRING HERE…; display:inline;'

    -- add the font-size style relative to the number of occurrences
    +
    case
        when w.n_occurs between 0 and 2             then N' font-size:10pt;'
        when w.n_occurs between 3 and 8             then N' font-size:14pt;'
        when w.n_occurs between 9 and 15            then N' font-size:16pt;'
        when w.n_occurs between 16 and 24           then N' font-size:22pt;'
        when w.n_occurs between 25 and 2147483647   then N' font-size:26pt;'
    end

    + N'"'
    -- add whatever html attributes we need (hover/click…)
    + N' id="div' + convert(nvarchar(32), w.id)
    + N'" onclick="select_data_cell(''' + convert(nvarchar(32), w.id) + N''');"'
    -- close the opening tag, then add the word itself and the closing tag
    + N'>' + w.word_string + N'</div>'
        as word_string_html

-- add other columns if needed
from dbo.doc5_trie_words w
order by w.word_string

 

The above view provides an html-preformatted string for each index word in its data row.
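To make the bucketing logic easier to follow, here is a minimal Python sketch of the same mapping. It is not part of doc5ync: the names font_size_for and word_div_html are hypothetical, the basic style string is reduced to display:inline, and n_occurs is assumed to be zero or more.

```python
from html import escape

# Upper bound of each occurrence bucket and its font size, mirroring the
# CASE expression of the view.
FONT_SIZE_BUCKETS = [(2, "10pt"), (8, "14pt"), (15, "16pt"), (24, "22pt")]
LARGEST_FONT_SIZE = "26pt"


def font_size_for(n_occurs: int) -> str:
    """Return the font size for a word seen n_occurs times (assumed >= 0)."""
    for upper_bound, size in FONT_SIZE_BUCKETS:
        if n_occurs <= upper_bound:
            return size
    return LARGEST_FONT_SIZE


def word_div_html(word_id: int, word: str, n_occurs: int) -> str:
    """Build the inline <div> for one index word (hypothetical helper)."""
    style = f"display:inline; font-size:{font_size_for(n_occurs)};"
    return (
        f'<div style="{style}" id="div{word_id}" '
        f"onclick=\"select_data_cell('{word_id}');\">{escape(word)}</div>"
    )


if __name__ == "__main__":
    print(word_div_html(28526, "after", 30))
```

Keeping the buckets in a list makes it easy to tune the thresholds without touching the formatting code.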

Tweaking the data rows into a cloud of words

On a web page, a dataset is commonly displayed as a grid (table / columns / rows), and web5ync knows how to read a data source, and output its rows into that form. But that did not seem to be convenient in our case, because it simply displays index words each on a row which is not really the presentation we are looking for!

 

To resolve this, we simply need to change the dataset web container from a <table> (and its containing rows / cells) into <div> tags (with style=display: inline).

Here is a sample of the html code of the above presentation:

<table>
  <tr>
    <td>
      <div style="font-size:16pt;" onclick="select_data_cell('29008');">After</div>
    </td>
  </tr>
  <tr>
    <td>
      <div style="font-size:26pt;" onclick="select_data_cell('28526');">after</div>
    </td>
  </tr>

<!-- the table rows go on... --> 


And here is a sample html code of the presentation we are looking for:

<div style="display:inline;">
  <div id="td_word_string_html82" style="display:inline;">
    <div style="font-size:26pt;" onclick="select_data_cell('28526');">after</div>
  </div>
</div>

<div style="display:inline;">
  <div id="td_word_string_html83">
    <div style="font-size:16pt;" onclick="select_data_cell('29008');">After</div>
  </div>
</div>

<div style="display:inline;">
  <div style="display:inline;">
    <div style="font-size:10pt;" onclick="select_data_cell('17657');">AFTER</div>
  </div>
</div>

 

Which looks closer to what we want:

Interacting with index words

The second part of our task is to allow the user to interact with the index words: clicking a word = display its related e-books, clicking an e-book = display the e-book details + display index words specifically linked to that e-book.

For this, we are going to use a few of the convenient features of web5ync, namely: Master/details data binding, and Tabs. (I will write more about these features in a future post)

Web5ync master/details binding allows linking a subset of data to a selected item in the master section. Basically, each data section is an iframe. The event of selecting a data row in one iframe can update the document source of one or more iframes. All we need is to: 1. define a column that will be used as the row's id, and 2. define how the value of that id should be passed to the target iframe (typically: a url parameter name).
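As an illustration only, the second point (passing the selected row's id as a url parameter) can be sketched as follows. This is not web5ync code: the function name details_url, the page name and the parameter name are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def details_url(target_url: str, param_name: str, row_id: str) -> str:
    """Return target_url with param_name set to the selected row id."""
    parts = urlsplit(target_url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[param_name] = row_id  # replace any previous selection
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical details page and parameter name.
    print(details_url("/books.aspx?lang=fr", "word_id", "28526"))
```

The resulting url would then become the new document source of the target iframe.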

Tabs are convenient in our case as they will allow distributing the information in several areas while optimizing web page space usage.

In the figure above, we have 3 main data tabs: Explore by words, Document info and Selected document words.

On the first tab:

  • clicking a word (in the upper iframe) should display the list of its related e-books (in lower iframe of that same tab)
  • clicking an e-book row in the lower iframe should first display its details (an iframe on the second tab), and then display all words directly linked to the selected e-book (an iframe in the 3rd tab). (figures below)

In that last tab, we can play once again with the displayed words, to show other documents sharing one of them:

doc5ync–Trie database integration process

I continue here the excursion around using the Trie pattern and structures to index e-book words for the doc5ync project.

If you missed the beginning of the story, you can find it in the previous posts of this series.

The role of the client integration tool (a WPF app) is to pull e-books information to be indexed from the database, proceed to indexing the words and creating the links between each word and its related e-book. This is done using some settings: the language to index, the minimum number of chars to consider a sequence as a ‘word’… etc.

trie-with-data-db-integration-process

The integration process flow is quite simple:

  • Once we are happy with the obtained results, we use the tool to push the trie to the database in a staging table.
  • A database stored procedure can then extract the staging data into the tables used for presenting the index on the project web page.

trie-web-page

The staging table has a few fields:

  • The word string
  • The related e-book ID (relationship => docs table (e-books))
  • The number of occurrences of the word
  • The timestamp of the last insertion

The only difficulty encountered was the number of records (often tens of thousands) to push to the staging table. The (artisanal!) solution was to concatenate the values of blocks of records into a single insert command (i.e.: ‘insert into table(field1, field2, …) values (v1, v2, …), (v3, v4, …), …’ etc.). Sending 150 records per command seemed to be a sustainable choice.
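The batching idea can be sketched in a few lines of Python. This is not the WPF tool's code: the table name doc5_trie_staging, the column names and the helper names are assumptions made for the example, and a real client may prefer parameterized commands to string concatenation.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[str, int, int]  # (word string, e-book id, occurrences)


def sql_literal(value) -> str:
    """Render a value as a SQL literal; strings get N'...' with quotes doubled."""
    if isinstance(value, str):
        return "N'" + value.replace("'", "''") + "'"
    return str(int(value))


def insert_commands(rows: List[Row], table: str, columns: Sequence[str],
                    batch_size: int = 150) -> Iterator[str]:
    """Yield one multi-row insert command per block of batch_size rows."""
    column_list = ", ".join(columns)
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        values = ", ".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in batch
        )
        yield f"insert into {table}({column_list}) values {values}"


if __name__ == "__main__":
    sample = [("after", 1, 3), ("l'eau", 2, 1)]
    # Hypothetical staging table and column names.
    for command in insert_commands(sample, "doc5_trie_staging",
                                   ["word_string", "doc_id", "n_occurs"]):
        print(command)
```

With 150 rows per command, tens of thousands of records translate into a few hundred round trips instead of one per record.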

The staging table data is to be dispatched into two production tables:

  • doc5_trie_words:
    • word ID
    • language ID
    • word string
    • word’s number of occurrences
    • comments

 

  • doc5_trie_word_docs:
    • word ID (relationship => the above table)
    • e-book ID (relationship => docs (e-books) table)

 

Once the data is in the staging table, the work of the stored procedure is quite straightforward:

  • Delete the current words table (which cascade deletes the words / docs reference records)
  • Import the staging word (strings and occurrences) records into doc5_trie_words
  • Import the related word / doc IDs into doc5_trie_word_docs.

Many words are common to several languages and e-books. Therefore, assigning a language to a word makes no sense unless all its related documents are in one specific language. That is the additional and final task of the stored proc.
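The steps above can be sketched with an in-memory SQLite database. This is only an illustration, not the actual stored procedure: the staging and docs table layouts are assumptions, and the sketch assumes each staging row carries the occurrences for one word / e-book pair, which are summed per word.

```python
import sqlite3

# Hypothetical, simplified schema (the real tables have more columns).
SCHEMA = """
CREATE TABLE docs (id INTEGER PRIMARY KEY, language_id INTEGER NOT NULL);
CREATE TABLE staging (word_string TEXT, doc_id INTEGER, n_occurs INTEGER);
CREATE TABLE doc5_trie_words (
    id INTEGER PRIMARY KEY, language_id INTEGER,
    word_string TEXT UNIQUE, n_occurs INTEGER);
CREATE TABLE doc5_trie_word_docs (word_id INTEGER, doc_id INTEGER);
"""


def dispatch_staging(conn: sqlite3.Connection) -> None:
    """Rebuild the two production tables from the staging table."""
    cur = conn.cursor()
    # 1. Delete the current index (done explicitly here instead of by cascade).
    cur.execute("DELETE FROM doc5_trie_word_docs")
    cur.execute("DELETE FROM doc5_trie_words")
    # 2. Import word strings and their total occurrences.
    cur.execute("""
        INSERT INTO doc5_trie_words (word_string, n_occurs)
        SELECT word_string, SUM(n_occurs) FROM staging GROUP BY word_string""")
    # 3. Import the word / e-book links.
    cur.execute("""
        INSERT INTO doc5_trie_word_docs (word_id, doc_id)
        SELECT DISTINCT w.id, s.doc_id
        FROM staging AS s
        JOIN doc5_trie_words AS w ON w.word_string = s.word_string""")
    # 4. Assign a language only when all related e-books share the same one.
    cur.execute("""
        UPDATE doc5_trie_words
        SET language_id = (
            SELECT CASE WHEN COUNT(DISTINCT d.language_id) = 1
                        THEN MIN(d.language_id) END
            FROM doc5_trie_word_docs AS wd
            JOIN docs AS d ON d.id = wd.doc_id
            WHERE wd.word_id = doc5_trie_words.id)""")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO docs VALUES (?, ?)", [(1, 10), (2, 20)])
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)",
                     [("after", 1, 2), ("table", 1, 1), ("table", 2, 4)])
    dispatch_staging(conn)
    print(conn.execute("SELECT * FROM doc5_trie_words").fetchall())
```

A word linked to e-books in two different languages ends up with a NULL language, which matches the rule described above.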

Next step: the index web page presentation!

That will be the subject of the next post!

TFS & SQL Server 2008R2 msxmlsql issues

I recently went through another new issue with TFS:

I had the TFS database and Data warehouse both installed on a SQL Server 2008 instance.

After installing a new SQL Server 2008R2 instance, TFS just stopped working complaining about database access.

It took me a while to re-discover that the source of annoyance was – again - the msxmlsql.dll and msxmlsql.rll files located at:

\Program Files\Microsoft SQL Server\100\Shared

And

\Program Files\Microsoft SQL Server\100\Shared\Resources\1033

respectively.

 

For some reason (not worth spending time searching for :) ), the versions to use with TFS are:

msxmlsql.dll (1,311,256 bytes); 07/10/2008; file version: 2007.100.1600.22

msxmlsql.rll (52,248 bytes); 07/10/2008

 

The files that caused my problem:

msxmlsql.dll (1,308,000 bytes); 04/03/2010; file version: 2009.100.1600.1

msxmlsql.rll (48,992 bytes); 04/03/2010

 

 

 

 

Another TFS pause: moving your stuff

As I already said (and I am surely not the only one to say so :) ): TFS is a precious tool for project management. The problem with TFS is how tedious it is to install and to troubleshoot specific issues.

This has greatly improved in TFS 2010, which is much less tedious (thanks to the TFS team and to Brian Harry for all the useful information), but IT IS, still, quite tedious!

For instance: one of the usual tasks we do from time to time in the real world is to move our stuff from somewhere to somewhere else!

Don't expect this to be as easy as you may think in TFS :)

 

The context

Let's say we have a whole TFS installation somewhere (configuration / collections databases… etc.) and we want (or need) to move everything to another machine.

At a first glance, that should be straightforward:

  • Install TFS (and SQL server) on the new machine;
  • Attach (or restore) TFS databases to the new machine;
  • Tell the new TFS where to find the databases;
  • Done!

 

Yes, it is a good path… but somewhat tortuous!

 

The recipe

On the new machine:

SQL configuration and settings

  • Install SQL Server (2008 or 2008R2). Note the following elementary observations:
    • During installation, you should select a CI_AS (case insensitive / accent sensitive) collation (either Windows or SQL);
    • Prefer a 'default instance' installation
  • Attach (or restore) all TFS databases (configuration database, collections and data warehouse databases).
  • Install latest SQL server updates and Service Packs.

 

TFS configuration and settings

  • Install TFS
  • DON'T click Configure at the end of the install process
  • Launch TFS admin console
    • Click Application Tier node
    • Click Configure Installed Features
    • Select the 'Application-Tier only' Wizard
    • Click List Available Databases.
      • This will display detected TFS configuration databases on the SQL server. If none is detected, then you have to go back to SQL configuration and see what is missing
    • Select the configuration database
    • Select the service account
    • Check and correct errors and warnings. You may need to go back to SQL configuration steps.
    • Click Configure

 

Problems encountered?

If you encounter problems (this is often the case with TFS :) ), you may need to uninstall and restart the Application-Tier configuration:

  • Open a command prompt (Run As Administrator)
  • Go to TFS tools directory (%program files%\Microsoft Team Foundation Server 2010\Tools)
  • Uninstall the Application Tier: Type the following command:

tfsconfig setup /uninstall:ApplicationTier

  • Restart the above TFS configuration steps at Launch TFS admin console

 

Security configuration issues

The original SQL Server configuration, TFS and the various service accounts may not match the new machine's configuration. This may generate some errors that can be corrected using SQL Server Management Studio and, if needed, the TFS admin console (Administer security option). It is sometimes necessary to go through the TFSSecurity command line utility.

 

Please note: all this should be considered as "useful hints" for moving TFS from machine to machine. You should be armed with patience and try to find other simple and reliable paths for your specific problems.

Anyway, Hope this may help!

A dive into the undocumented TFS meta-model – Part II

 

In a previous post, I started exploring the TFS configuration database. Let's continue here by having a look at some more configuration objects and relationships.

TFS Config security objects

 

Referenced table | Primary column | Table | Foreign column
--- | --- | --- | ---
tbl_security_identity_cache | tf_id | tbl_gss_group_membership | member_id
tbl_security_identity_cache | tf_id | tbl_security_domain_groups | group_id
tbl_security_identity_cache | tf_id | tbl_security_membership_cache | container_id
tbl_security_identity_cache | tf_id | tbl_security_membership_cache | member_id
tbl_gss_groups | tf_id | tbl_gss_group_membership | parent_group_id
tbl_security_domain | domain_number | tbl_security_domain_groups | domain_number
tbl_security_domain | domain_number | tbl_security_projects | domain_number
tbl_security_identity_type | type_id (int) | tbl_security_identity_cache | type (tinyint)

The relationship diagram above suggests that we can query user logins, group-membership for specific projects by a query like the following:


SELECT TOP (100) PERCENT
    login.display_name      AS user_name,
    proj.scope_name         AS project,
    login_grp.display_name  AS user_group

FROM dbo.tbl_security_membership_cache AS membership
    INNER JOIN dbo.tbl_security_identity_cache AS login ON membership.member_id = login.tf_id
    INNER JOIN dbo.tbl_security_domain_groups AS grp ON membership.container_id = grp.group_id
    INNER JOIN dbo.tbl_security_domain AS domain ON domain.domain_number = grp.domain_number
    INNER JOIN dbo.tbl_security_projects AS proj ON proj.domain_number = domain.domain_number
    INNER JOIN dbo.tbl_security_identity_cache AS login_grp ON grp.group_id = login_grp.tf_id

ORDER BY user_name, project

 

The above query may give us results similar to the following

user_name | project | user_group
--- | --- | ---
Administrator | Research education "agile | Administrators
Administrator | Research education "agile | [TEAM FOUNDATION]\Team Foundation Service Accounts
Administrator | SourceSafeProjects | [TEAM FOUNDATION]\Team Foundation Service Accounts
Administrator | SourceSafeProjects | Administrators
Administrator | DefaultCollection | [TEAM FOUNDATION]\Team Foundation Service Accounts
Administrator | DefaultCollection | Administrators

 

TFS database pause: Change set quantitative statistics sample

Where to find change-sets information?


TFS source control maintains change-sets for projects in the corresponding collection database. Each change-set is of course composed of the modified project files at the time of the Check-in.

Change-set information is stored in the tbl_ChangeSet table of the project collection database. That is: if you have a project collection 'Default Collection' stored in a database named tfs_default_collection, you will find change-sets information in the table tfs_default_collection.dbo.tbl_ChangeSet.

File information is stored in the tbl_File table, and file-version information in the tbl_Version table.

So, to obtain the list of files of change-set number 861, for instance, you can execute a query similar to the following:


SELECT TOP (100) PERCENT
    chg_set.CreationDate,
    chg_set.ChangeSetId,
    v.FullPath

FROM dbo.tbl_ChangeSet (nolock) AS chg_set
    INNER JOIN dbo.tbl_Version (nolock) AS v ON chg_set.ChangeSetId = v.VersionFrom
    LEFT OUTER JOIN dbo.tbl_File (nolock) AS f ON v.FileId = f.FileId

WHERE (chg_set.ChangeSetId = 861)

ORDER BY chg_set.CreationDate, v.FullPath

 

 

Which may produce results like the following:

CreationDate | ChangeSetId | FullPath
--- | --- | ---
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\commands\UniConvertCommandModel.cs\
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\Convert.xaml.cs\
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\Convert.xaml\
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\MainWindow.xaml.cs\
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\MainWindow.xaml\

 

How many bytes changed?


As you may have noticed from the figure above, change-set file information contains a FileLength column. This suggests that we can obtain the net change in size of a given file between two change-sets. From the following query results, for instance, we can see that the file grew by 696 - 551 = 145 bytes:

CreationDate | ChangeSetId | FullPath | FileLength
--- | --- | --- | ---
2012-04-30 22:33:10.040 | 843 | $\Research education "agile\SimpleUniConverter\Convert.xaml.cs\ | 551
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\Convert.xaml.cs\ | 696

 

Who changed what?


The change-set table provides a CommitterId column through which we can know who made a given check-in. As this column contains an ID (an integer referring to tbl_identity.IdentityId), this information is not directly readable. We have to join other tables to obtain something useful.

In fact, the committer's user name is stored in the Constants table. So we have to join the change-set's CommitterId column to tbl_identity.IdentityId, and then join tbl_identity.TeamFoundationId to Constants.TeamFoundationId. We can then get the user name from the Constants.DisplayPart column (or the Constants.String column).

 

 

This can be presented by the following change in our initial view diagram:

 

CreationDate | ChangeSetId | FullPath | FileLength | DisplayPart
--- | --- | --- | --- | ---
2012-04-30 22:33:10.040 | 843 | $\Research education "agile\SimpleUniConverter\Convert.xaml.cs\ | 551 | Taoffi
2012-05-05 20:22:31.800 | 861 | $\Research education "agile\SimpleUniConverter\Convert.xaml.cs\ | 696 | Taoffi

 

Changed bytes tracking generalization


It would, ideally, be beneficial to integrate such information into the TFS analytical database.

We can still work around this by querying the database tables directly.

The idea is to list all change-set files in the order of their respective check-in dates. To handle the possibility of identical dates / files, we can use a row number to distinguish changes. This may be done as in the following view:


CREATE VIEW [dbo].[xtfs_changeset_files_qry]
AS
SELECT TOP (2147483647)
    ROW_NUMBER() OVER (ORDER BY chg_set.CreationDate) AS row_number,
    chg_set.CreationDate,
    const.DisplayPart       AS user_name,
    chg_set.ChangeSetId,
    v.FullPath,
    f.FileLength

FROM dbo.tbl_ChangeSet (nolock) AS chg_set
    INNER JOIN dbo.tbl_Identity (nolock) AS login ON chg_set.CommitterId = login.IdentityId
    INNER JOIN dbo.tbl_Version (nolock) AS v ON chg_set.ChangeSetId = v.VersionFrom
    LEFT OUTER JOIN dbo.tbl_File (nolock) AS f ON v.FileId = f.FileId
    INNER JOIN Constants (nolock) AS const ON const.TeamFoundationId = login.TeamFoundationId

ORDER BY chg_set.CreationDate, v.FullPath

GO

 

We can then join the view to itself (current row / next row) to obtain row-relative quantitative changes:

SELECT v1.row_number,
       v1.CreationDate,
       v1.user_name,
       v1.FullPath AS v1FilePath,
       v1.FileLength AS v1_length,
       v_next.FileLength AS v2_length,
       v_next.FullPath AS v2FilePath,
       ISNULL(v_next.FileLength, 0)
          - ISNULL(v1.FileLength, 0) AS changed_bytes

FROM dbo.xtfs_changeset_files_qry AS v_next RIGHT OUTER JOIN
     dbo.xtfs_changeset_files_qry AS v1 ON v_next.FullPath = v1.FullPath
        AND v_next.row_number = v1.row_number + 1
        OR v_next.FullPath IS NULL
        AND v_next.row_number IS NULL
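To make the intent of this self-join concrete, here is a small Python sketch of the same idea applied to rows already fetched from the database. It is an illustration under assumptions, not the query above: rows are ordered within each file path (rather than by a global row number), each check-in is compared with the previous one of the same file, and the names FileVersion and changed_bytes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class FileVersion(NamedTuple):
    creation_date: str           # 'YYYY-MM-DD hh:mm:ss.fff' sorts as text
    changeset_id: int
    full_path: str
    file_length: Optional[int]   # None when no file row was found


def changed_bytes(versions: List[FileVersion]) -> List[Tuple[int, str, int]]:
    """Return (changeset_id, full_path, size delta) for every check-in."""
    by_path: Dict[str, List[FileVersion]] = defaultdict(list)
    for version in versions:
        by_path[version.full_path].append(version)

    result = []
    for path, history in by_path.items():
        # Order each file's history by check-in date, then change-set id.
        history.sort(key=lambda v: (v.creation_date, v.changeset_id))
        previous_length = 0  # a file's first version counts from zero
        for version in history:
            length = version.file_length or 0
            result.append((version.changeset_id, path, length - previous_length))
            previous_length = length
    return result


if __name__ == "__main__":
    path = r"$\SimpleUniConverter\Convert.xaml.cs"
    rows = [
        FileVersion("2012-05-05 20:22:31.800", 861, path, 696),
        FileVersion("2012-04-30 22:33:10.040", 843, path, 551),
    ]
    for changeset_id, full_path, delta in changed_bytes(rows):
        print(changeset_id, full_path, delta)
```

Grouping by path before ordering avoids pairing a file with an unrelated file that happens to be next in the global check-in order.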

 

A dive into the undocumented TFS meta-model – Part I

TFS… introduction to this series


Microsoft TFS (Team Foundation Server… recently renamed to ALM: Application Lifecycle Management) is a great project management platform.

TFS is built on meta-model abstractions (that were first seen, in Microsoft's products, in SharePoint).

TFS, like SharePoint, is composed of several software components which operate around several databases: the configuration database and one or more content databases.

Configuration and content databases' elements (tables, views, functions, stored procedures… etc.) represent the meta-models (schemas and business logic for each related feature) which govern and control much of the software components' behaviors.

TFS is often presented as a software project management solution. In my view, this undersells its value, because TFS features truly span a much larger project management scope.

Much useful literature has been written about TFS: its features, usage, guidance, troubleshooting… etc. Few authors have shed light on its meta-model structures and, to my knowledge, none has clearly described its (undocumented) database structures. That is what I will try to do in this series, more to highlight TFS abstractions than to expose its internal mechanics, and also to reinforce, once again, the meta-model approach as a rational basis for software solutions.

 

TFS logical architecture

 

For its contents, TFS maintains a database per project-collection.

Each collection may contain one or more projects, and each project can be managed according to a selected methodology (Agile / Scrum / CMMI… etc.). In fact, TFS applies and operates the selected methodology through the provided methodology project management template. Ultimately, this means you can build your own methodology and use it to manage your project with TFS. All you need is to provide the desired methodology template.

Managing a project with TFS consists of recording and following-up specific Work items (defined by the methodology template)… those can be items like: Tasks, Bugs, Change requests, User story… etc.

As TFS was initially brewed to manage software projects, a common item in each project is the Source control repository which helps developers to maintain their source code versions and branches in a very efficient architecture composed of 'change sets'.

 

 

To discover and figure out its components' data and their required behaviors, TFS uses a configuration database. A first look at the configuration database will be the subject of this first article in the series.

 

The configuration database


Like SharePoint, TFS databases are delivered without documentation or relational diagrams. They simply don't contain any explicit foreign keys or relationships. Whatever the reason behind this choice, it seems a little surprising for a product whose main aim is project transparency and documentation :).

That said, to discover, understand and correctly interpret the role of each TFS database item and their inter-relationships, you have to dive deep, armed with some patience and empirical methods!

 

Hosts and processes


Let's have a look at three important tables:

  • tbl_ServiceHost contains the list of services' host information. This includes the item's name, the database connection string (when relevant), the Virtual directory and Resource directory…
  • tbl_ServiceHostProcess contains information about current processes. Machine name, Process name, Identity under which the process is running, start time…
  • tbl_ServiceHostInstance contains the list of processes' instances per host (Process Id / Host Id / Start time).

 

In my case, the table tbl_ServiceHost contained 3 entries for:

  • Team Foundation: root item (parentless… see diagram ParentHostId column). Connected to the configuration database.
  • Two child entries each for one of my TFS Project collections:
    • Default collection.
    • Training collection.

 

The above relational schema allows us to query the tables and obtain results like the following about current running services:


SELECT TOP (100) PERCENT
svcHost.Name         AS host,
svcHost.VirtualDirectory,
svcHost.ConnectionString,
process.MachineName    AS machine,
process.ProcessName,
process.StartTime


FROM dbo.tbl_ServiceHostInstance AS instance INNER JOIN
dbo.tbl_ServiceHost AS svcHost ON instance.HostId = svcHost.HostId LEFT OUTER JOIN
dbo.tbl_ServiceHostProcess AS process ON instance.ProcessId = process.ProcessId LEFT OUTER JOIN
dbo.tbl_ServiceHost AS hostParent ON svcHost.ParentHostId = hostParent.HostId

ORDER BY process.StartTime DESC, hostParent.Name, host

 

Sample results

host | VirtualDirectory | ConnectionString | machine | ProcessName | StartTime
--- | --- | --- | --- | --- | ---
TEAM FOUNDATION | NULL | Data Source= hp10\SqlTfs;Initial Catalog=Tfs_Configuration;Integrated Security=True | HP10 | w3wp.exe | 2012-05-21 20:18:09.997
DefaultCollection | ~/DefaultCollection/ | Data Source=hp10\sqltfs;Initial Catalog=tfs_defaultcollection;Integrated Security=True | HP10 | w3wp.exe | 2012-05-21 20:18:09.997
Training collection | ~/Training collection/ | Data Source=hp10\SqlTfs;Initial Catalog="Tfs_Training collection";Integrated Security=True | HP10 | w3wp.exe | 2012-05-21 20:18:09.997
TEAM FOUNDATION | NULL | Data Source=hp10\SqlTfs;Initial Catalog=Tfs_Configuration;Integrated Security=True | HP10 | TfsJobAgent.exe | 2012-05-21 06:51:37.870
DefaultCollection | ~/DefaultCollection/ | Data Source=hp10\sqltfs;Initial Catalog=tfs_defaultcollection;Integrated Security=True | HP10 | TfsJobAgent.exe | 2012-05-21 06:51:37.870
Training collection | ~/Training collection/ | Data Source=hp10\SqlTfs;Initial Catalog="Tfs_Training collection";Integrated Security=True | HP10 | TfsJobAgent.exe | 2012-05-21 06:51:37.870

 

Resources and services

Again, let's look at the following tables:

  • tbl_CatalogResourceType stores a list of resource types, like: 'Machine', 'Infrastructure Root', 'Project Server', 'Team Foundation Project Collection Database'… etc. Here are some sample entries:

Identifier | DisplayName | Description
--- | --- | ---
9FB288AE-9D94-40CB-B5E7-0EFC3FE3599F | Default location for team project portals | By default, each team project Web site will be a sub-site of this location.
0584A4A2-475B-460E-A7AC-10C28951518F | Machine | A machine that exists as part of the TFS deployment.
14F04669-6779-42D5-8975-184B93650C83 | Infrastructure Root | The root of the catalog tree that describes the physical makeup of the TFS deployment.
1B6B5931-69F6-4C53-90A0-220B177353B7 | Team Foundation Project Collection Database | The database that houses the information for a Team Foundation Project Collection.
526301DE-F821-48C8-ABBD-3430DC7946D3 | Team Foundation Configuration Database | The database that houses the information for a Team Foundation Application Instance.
289DD275-CECA-4698-8042-38D2E86FC682 | Project Server | Project Server Information
3DADD190-40E6-4FC1-A306-4906713C87CE | SharePoint Web Application | A SharePoint Web application that has been configured to support team project portals.
3C856555-8737-48B6-8B61-4B24DB7FEB15 | Test Controller | This is a test controller type in Catalog Service.
26338D9E-D437-44AA-91F2-55880A328B54 | Team Project Collection | A Team Project Collection that exists within the TFS deployment.
450901B6-B528-4863-9876-5BD3927DF467 | Project Portal | A Web site or SharePoint site that provides a portal for a team project.
15DA1594-45F5-47D4-AE52-78F16E67EB1E | Process Guidance | A Web site or SharePoint library that provides process guidance for a team project.
47FA57A4-8157-4FB5-9A64-A7A4954BD284 | Team Web Access | Team Web Access Location
48577A4A-801E-412C-B8AE-CF7EF3529616 | Team Project | A Team Project that exists within the TFS deployment.

 

  • tbl_CatalogResource stores resources, each referring to one of the types in the above tbl_CatalogResourceType table. Sample entries:

 

Identifier | ResourceType | DisplayName | Description
--- | --- | --- | ---
7462A882-741B-4F91-885F-1405A22256DF | 1B6B5931-69F6-4C53-90A0-220B177353B7 | Tfs_Training collection |
C5CD36C5-8373-4BF7-B3FA-16199DAFF106 | 48577A4A-801E-412C-B8AE-CF7EF3529616 | cmmi project | cmmi training project
16480162-C502-4197-872A-4DDDFE49BA90 | 1B6B5931-69F6-4C53-90A0-220B177353B7 | tfs_defaultcollection |
DCC270E7-6EC7-45BF-A827-53BBB5386606 | 14F04669-6779-42D5-8975-184B93650C83 | Infrastructure Root | The root of the catalog tree that describes the physical makeup of the TFS deployment.
2F8C7D1C-EB1F-4058-B34D-60FA2F6CA7DC | 48577A4A-801E-412C-B8AE-CF7EF3529616 | Agile project |
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | 47FA57A4-8157-4FB5-9A64-A7A4954BD284 | Team Web Access |
4B812279-A7A6-4A8A-92A4-7AA208589FEB | EB1E0B3B-FAA1-49D2-931A-FDC373682BA5 | SQLTFS |
F69EFD2B-51C6-461B-9E49-8089DF86C9A3 | 0584A4A2-475B-460E-A7AC-10C28951518F | hp10 |
5242DBFC-06C8-4CAD-A27B-99212BEEE999 | 526301DE-F821-48C8-ABBD-3430DC7946D3 | Tfs_Configuration |
713FF18A-E0E8-4192-A7B9-A0900F50A5BC | 26338D9E-D437-44AA-91F2-55880A328B54 | DefaultCollection |

 

  • tbl_ServiceDefinition stores a list of predefined services (classes) and the web access page for each. Sample entries (among other things, note the curly braces in the RelativePath column, which are placeholders for url parameters):

 

ServiceType | Identifier | DisplayName | RelativePath
--- | --- | --- | ---
AdministrationService | C18D6E34-68E8-40D2-A619-E7477558976E | Administration Service | /TeamFoundation/Administration/v3.0/AdministrationService.asmx
Annotate | 74B15E02-0AC2-414F-A9B9-30268659D3B5 | Team Web Access (Annotate) | /web/ann.aspx
AnnotateSourceControlItem | D271E722-C261-4BC2-B0F7-1C8A9E13F907 | Team Web Access (AnnotateSourceControlItem) | /web/ann.aspx?pcguid={projectCollectionGuid}&path={itemPath}&cs={itemChangeset}
CatalogService | C2F9106F-127A-45B7-B0A3-E0AD8239A2A7 | Catalog Service | /TeamFoundation/Administration/v3.0/CatalogService.asmx
ChangesetDetail | D40EF625-CCA7-4E73-B9EC-86CBE1534CE0 | Team Web Access (ChangesetDetail) | /web/cs.aspx
CreateWorkItem | 14CD69C6-88F9-4C8C-A259-D2441D77D1AF | Team Web Access (CreateWorkItem) | /web/wi.aspx?puri={projectUri}&wit={workItemType}
Difference | 2B84D900-1F08-486C-9C47-0E6AF371D03C | Team Web Access (Difference) | /web/diff.aspx
Eventing | C424AE04-8C6F-4516-8B2D-238FFFCA3081 | Event Service | /TeamFoundation/Administration/v3.0/EventService.asmx
IdentityManagementService | 3DE26348-00BE-4B82-8E4A-E5AD004CFECD | Identity Management Service | /TeamFoundation/Administration/v3.0/IdentityManagementService.asmx
JobService | DA1C0184-14FE-4E13-B7FC-6EAA07D84BE8 | TFS Background Job Service | /TeamFoundation/Administration/v3.0/JobService.asmx
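As an illustration of how such curly-brace placeholders might be resolved, here is a short Python sketch. How TFS itself fills them is not documented here: the function name fill_relative_path, the url-encoding of the values and the sample project URI are all assumptions.

```python
import re
from typing import Mapping
from urllib.parse import quote

# Matches placeholders such as {projectUri} or {workItemType}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_relative_path(template: str, values: Mapping[str, object]) -> str:
    """Replace each {name} in template with its url-encoded value."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder {{{name}}}")
        # Encoding the value is an assumption made for this sketch.
        return quote(str(values[name]), safe="")

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = "/web/wi.aspx?puri={projectUri}&wit={workItemType}"
    # Hypothetical values, for illustration only.
    print(fill_relative_path(template, {"projectUri": "vstfs:///x/1",
                                        "workItemType": "Bug"}))
```

Raising an error on a missing value keeps a half-filled url from being used silently.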

 

  • tbl_CatalogServiceReference stores per-resource services' entries. (Resource Identifier / Service Identifier…).

 

ResourceIdentifier | AssociationKey | ServiceIdentifier | ServiceType
--- | --- | --- | ---
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | ChangesetDetail | D40EF625-CCA7-4E73-B9EC-86CBE1534CE0 | ChangesetDetail
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | CreateWorkItem | 14CD69C6-88F9-4C8C-A259-D2441D77D1AF | CreateWorkItem
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | Difference | 2B84D900-1F08-486C-9C47-0E6AF371D03C | Difference
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | DiffSourceControlItems | 5E91C4DA-0013-4EBB-943D-CC77F5ADB82D | DiffSourceControlItems
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | DiffSourceControlShelvedItem | 4C81A44D-67AB-4D23-9CBE-339C9102993B | DiffSourceControlShelvedItem
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | ExploreSourceControlPath | AC0770BC-1DD6-4B8E-A811-5A03690DF44F | ExploreSourceControlPath
D7121DA7-3E32-49C3-BCB1-68C83C4383B7 | Home | 0F9CED5D-89F9-4743-BAB8-FA511FF09A8C | TSWAHome

 

This allows us to query the services in a way similar to the following:

SELECT resType.DisplayName AS ResourceType,
       svc.DisplayName     AS Service,
       svc.Description     AS serviceDescription
FROM dbo.tbl_ServiceDefinition AS svc
     INNER JOIN dbo.tbl_CatalogServiceReference AS svcRef
             ON svc.Identifier = svcRef.ServiceIdentifier
     INNER JOIN dbo.tbl_CatalogResource AS Res
             ON svcRef.ResourceIdentifier = Res.Identifier
     INNER JOIN dbo.tbl_CatalogResourceType AS resType
             ON Res.ResourceType = resType.Identifier

 

 

Sample results:

| ResourceType | Service | serviceDescription |
| --- | --- | --- |
| Team Web Access | Team Web Access (Annotate) | |
| Team Web Access | Team Web Access (AnnotateSourceControlItem) | |
| Team Web Access | Team Web Access (ChangesetDetail) | |
| Team Web Access | Team Web Access (CreateWorkItem) | |
| Team Web Access | Team Web Access (Difference) | |
| Team Web Access | Team Web Access (DiffSourceControlItems) | |
| Team Web Access | Team Web Access (DiffSourceControlShelvedItem) | |
| Team Web Access | Team Web Access (DiffSourceControlShelvedItem) | |
| Team Web Access | Team Web Access (ExploreSourceControlPath) | |
| Team Project Collection | Location Service | Location Service for Visual Studio Team Foundation Server. |
| Team Project Collection | Location Service | Location Service for Visual Studio Team Foundation Server. |
| Team Foundation Server Instance | Location Service | Location Service for Visual Studio Team Foundation Server. |
| Team Web Access | Team Web Access (OpenWorkItem) | |

 

More about TFS meta-models in upcoming posts!

Sorting dates: Ascending, Descending OR…

This is a small story, yet with something special:

Once upon a time, there was a list of events stored in a database table. Each event had (obviously :) ) a date.

A friend of mine asked me to sort these event items.

I asked, as you may imagine: Ascending or Descending?

 

But this, apparently, seemed too simplistic for him…

- I want the most recent to appear first, he said.

- Well, that means you want to sort in descending order.

I gave him a demo of data sorted in descending order…

- No, he said, that is not what I want!

 

After a short discussion, I understood that what he wanted was:

  • First, the upcoming events (i.e. those dated after “today”), sorted in ascending order;

  • Followed by the older dates (also sorted in ascending order).

 

I first thought: can ORDER BY do anything other than sort in ascending or descending order?

As the answer was, evidently, no… I said: Sorry, but that is not possible!

The question stayed on my mind, however, and I started looking for a solution (for myself!).

The experimentation did not last long; the solution was quickly found.

 

All we need is to:

  • Categorize dates according to their position relative to today (or any other logic);

  • Give each category a ‘sortable’ value (an integer, for example);

  • Sort by the category’s value, followed by the date value.

 

Example:

SELECT event_date,
       description,
       CASE WHEN event_date >= getutcdate() THEN 1   -- upcoming events
            ELSE 2                                   -- past events
       END AS date_category                          -- categorize date
FROM dbo.sight_conferences
ORDER BY date_category,   -- upcoming events first, then past events
         event_date       -- ascending within each category

 

 

Screenshot of the output data sample

 

 

 

The key part of this code is the logic used to categorize dates. In this sample, the categorization is quite simple (a category is assigned according to the date's position relative to today). But, of course, any other categorization logic can be used to obtain data sorted accordingly.
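As one illustration of a different sorting logic, the sketch below still lists upcoming events first (nearest first), but shows past events from the most recent to the oldest. The table variable @events, its rows and the fixed @today value are hypothetical sample data, not the original sight_conferences table, and the script assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: @events and @today exist only for this illustration.
DECLARE @today datetime = '20100615';   -- fixed reference date, so the result is reproducible

DECLARE @events TABLE (event_date datetime, description nvarchar(50));
INSERT INTO @events (event_date, description) VALUES
    ('20100301', N'past event A'),
    ('20100520', N'past event B'),
    ('20100701', N'upcoming event C'),
    ('20100910', N'upcoming event D');

SELECT event_date, description
FROM @events
ORDER BY
    CASE WHEN event_date >= @today THEN 1 ELSE 2 END,          -- category: upcoming first
    CASE WHEN event_date >= @today THEN event_date END ASC,    -- upcoming: nearest first
    CASE WHEN event_date <  @today THEN event_date END DESC;   -- past: most recent first
-- Row order: upcoming event C, upcoming event D, past event B, past event A
```

Each CASE expression yields NULL for the rows of the other category, so it only influences the ordering inside its own category.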

Managing databases periodic membership (or: unique time span indexing) Part II

How to view periodic relationship data

In the previous post, I presented a method for uniquely indexing the time spans of periodic memberships.

 

Here, we will discuss the next task: how to view historical data in such structures?

To illustrate this, we will use portfolio sample data, simply because it is a more elaborate case than the hotel room reservation one.

 

Suppose we have:

  • A portfolio named ‘Sample portfolio’

  • A set of companies: company 1, company 2 and company 3

  • A set of daily historical data for each company for the period from 2010/01/01 to 2010/01/10

 

Now suppose that:

  • Company 1 and company 3 are ‘permanent members’ of the portfolio starting on 2010/01/01

  • Company 2 is a member of the portfolio for the following periods:

      ◦ 2010/01/01 to 2010/01/05

      ◦ 2010/01/08 onwards (no end date)

 

To query the daily historical data for company 2 in relation to its parent portfolio, we should thus obtain only the data for the periods during which it is actually a member of the portfolio:

  • 2010/01/01, 2010/01/02, 2010/01/03, 2010/01/04, 2010/01/05

  • 2010/01/08, 2010/01/09 and 2010/01/10

 

(Note that data for 2010/01/06 and 2010/01/07 should be ignored)

 

This can be done using a view similar to the following (TSQL followed by the view design diagram)

 

SELECT p.name AS portfolio,
       c.name AS company,
       history.date,
       history.data
FROM dbo.portfolio_companies AS pc
     INNER JOIN dbo.portfolios AS p
             ON pc.portfolio_id = p.id
     INNER JOIN dbo.company_data AS history
             ON pc.date_start <= history.date
            AND ISNULL(pc.date_end, CONVERT(datetime, '9999/01/01')) >= history.date
     INNER JOIN dbo.companies AS c
             ON history.company_id = c.id
            AND pc.company_id = c.id

 

As said above, company 2 is a portfolio member during the following periods:

  • 2010/01/01 to 2010/01/05

  • 2010/01/08 onwards (no end date)

 

 

 

With our sample history data, this view produces the following data for company 2 (8 entries):

 

 

Note: the history data table contains the following data for company 2 (10 entries):
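To try the idea without the full schema, here is a minimal, self-contained sketch limited to company 2. The table variables @membership and @history stand in for portfolio_companies and company_data, and the data values are made up for illustration; the script assumes SQL Server 2008 or later.

```sql
-- Hypothetical stand-ins for dbo.portfolio_companies and dbo.company_data.
DECLARE @membership TABLE (company_id int, date_start datetime, date_end datetime NULL);
INSERT INTO @membership (company_id, date_start, date_end) VALUES
    (2, '20100101', '20100105'),
    (2, '20100108', NULL);               -- open-ended membership period

DECLARE @history TABLE (company_id int, [date] datetime, data float);
-- Ten daily rows for company 2, from 2010/01/01 to 2010/01/10 (values are invented).
INSERT INTO @history (company_id, [date], data)
SELECT 2, DATEADD(day, d.n, '20100101'), 100 + d.n
FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(n);

-- Keep only the history rows that fall inside a membership period.
SELECT h.[date], h.data
FROM @membership AS m
     INNER JOIN @history AS h
             ON h.company_id = m.company_id
            AND h.[date] >= m.date_start
            AND h.[date] <= ISNULL(m.date_end, '99990101')
ORDER BY h.[date];
```

With these rows the query returns 8 of the 10 history rows: 2010/01/06 and 2010/01/07 are filtered out because they fall outside both membership periods.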

 

 

Managing databases periodic membership (or: unique time span indexing)

Introduction

The relations between some ‘real world’ objects can sometimes take the form of ‘periodic membership’. By ‘periodic membership’, I mean the case where one object can be a member of another during one or more periods of time.

Examples:

  • A person who occupies a hotel room during a period of time;

  • A company which is a member of a portfolio during a period of time;

  • … etc.

 

The context also assumes that we want to keep the membership history and be able to build data aggregations that take into account these periodic memberships. (For example: be able to calculate portfolio performance during a period of time that spans several ‘periodic memberships’).

 

A concrete example may better explain the case:

  • We have a portfolio for which we want to maintain historical performance over time;

  • On 2009/01/01, the portfolio contained company1, company2 and company3;

  • On 2009/03/01, we changed the portfolio components and excluded company2 (the portfolio now contains only company1 and company3);

  • On 2009/06/01, to calculate the portfolio performance for the period 2009/01/01 to 2009/06/01, we should take company2 into account for the period 2009/01/01 to 2009/03/01 (although it is no longer part of the portfolio).

 

Since database engines generally do not offer a ‘time-overlap’ unique indexing feature, resolving this situation requires a structure that allows us to assign and manage periodic membership indexes.

Note: I am surely not the first to face this situation, but I wanted to share my experience in solving this problem and will be glad to receive any thoughts about it.

 

Periodic membership multiplicity models

In my experience, I have encountered two types of multiplicity in periodic relationships:

  • One-to-one periodic membership: a good example would be a hotel room that can be occupied by only one person during a given period (i.e. the period is unique per room);

  • One-to-many periodic membership: a good example would be a portfolio that can have more than one member during the same period (i.e. the period is unique per portfolio member).

In this article, I will explain how to handle both types.

 

Hotel room reservation, the basic structure

To demonstrate a hotel reservation periodic membership (uniqueness of room/reservation period), we can create the following tables and relationships:

 

  • Hotel clients: contains the list of clients

  • Hotel rooms: contains the list of rooms

  • Hotel room reservations: contains the client/room reservation periods

 

Note: the date_end field of the reservations table can be null, to allow ‘permanent’ (or endless) reservations.

 

To ensure uniqueness of reservation periods, we will create a scalar function (returning a bit value used as a Boolean) that can be used as a check constraint on the hotel_room_reservations table.

 

CREATE FUNCTION [dbo].[is_valid_hotel_reservation_period]
(
       @item_id     int,
       @room_id     int,
       @date_start  datetime,
       @date_end    datetime
)
RETURNS bit
AS
BEGIN
       -- validate the provided parameters
       if @date_start is null or @room_id is null
             return 0

       -- be sure start date < end date
       if @date_end is not null
       begin
             if @date_start >= @date_end
                    return 0
       end

       -- is this the first reservation period for this room?
       if ( select count(*)
                    from dbo.hotel_room_reservations
                    where( room_id = @room_id) ) <= 0
             return 1

       -- check overlapped periods
       if @date_end is null              -- is this a permanent (endless) reservation?
       begin
             -- an endless period overlaps any other period that ends on or after its start
             if ( select  count(*)
                    from dbo.hotel_room_reservations
                    where( ( room_id = @room_id)
                                  and( id != @item_id)
                                  and(( date_end is null) or (date_end >= @date_start))
                           )) > 0
                    return 0
       end
       else
       begin
             -- a bounded period overlaps any other period that starts before its end
             -- and ends (or never ends) after its start
             if ( select  count(*)
                    from dbo.hotel_room_reservations
                    where(       (room_id = @room_id)
                                  and( id != @item_id)
                                  and( date_start <= @date_end)
                                  and(( date_end is null) or (date_end >= @date_start))
                           )) > 0
                    return 0
       end

       -- return OK
       return 1
END
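The overlap test at the heart of this function can be tried in isolation. The following is only a sketch: the table variable @reservations, its rows and the candidate period are hypothetical, and the script assumes SQL Server 2008 or later. It lists the existing reservations that a candidate period would collide with.

```sql
-- Hypothetical existing reservations for room 101.
DECLARE @reservations TABLE (id int, room_id int, date_start datetime, date_end datetime NULL);
INSERT INTO @reservations (id, room_id, date_start, date_end) VALUES
    (1, 101, '20100101', '20100105'),
    (2, 101, '20100110', NULL);          -- endless reservation

-- Candidate period to validate (set @date_end to NULL for an endless one).
DECLARE @room_id    int      = 101,
        @date_start datetime = '20100104',
        @date_end   datetime = '20100107';

-- Two periods overlap when each one starts before the other one ends;
-- a NULL end date is treated as "never ends".
SELECT r.id, r.date_start, r.date_end
FROM @reservations AS r
WHERE r.room_id = @room_id
  AND (@date_end IS NULL OR r.date_start <= @date_end)
  AND (r.date_end IS NULL OR r.date_end >= @date_start);
-- Returns reservation 1 only: it ends on 2010/01/05, after the candidate starts.
```

Reservation 2 starts on 2010/01/10, after the candidate period ends, so it is not reported as a conflict.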

 

We can now use this function as check constraint on the reservation table:
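In T-SQL, attaching the function might look like the sketch below. It is not self-contained: it assumes that dbo.hotel_room_reservations and the function above already exist, and that the table has the id, room_id, date_start and date_end columns used in the function body. The constraint name is hypothetical.

```sql
-- Sketch only: the constraint name CK_hotel_room_reservations_period is made up.
-- WITH CHECK also validates the rows already present in the table.
ALTER TABLE dbo.hotel_room_reservations WITH CHECK
ADD CONSTRAINT CK_hotel_room_reservations_period
CHECK (dbo.is_valid_hotel_reservation_period(id, room_id, date_start, date_end) = 1);
```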

 

 

Overlapping entries in the reservation table would now display an error message like:

The Portfolio/company periodic membership case

In the portfolio/company case, membership should allow multiple membership instances for different periods. That is: the same company can be a member of the same portfolio several times, as long as the membership periods do NOT overlap.

 

Our tables and relationships may look like the following:

 

All we need is to create a new version of the above function, and use it as a check constraint condition for the portfolio companies table:

 

CREATE FUNCTION [dbo].[is_valid_portfolio_company_period]
(
       @item_id     int,
       @portfolio_id int,
       @company_id  int,
       @date_start  datetime,
       @date_end    datetime
)
RETURNS bit
AS
BEGIN
       -- validate the provided parameters
       if @company_id is null or @date_start is null or @portfolio_id is null
             return 0

       -- be sure start date < end date
       if @date_end is not null
       begin
             if @date_start >= @date_end
                    return 0
       end

       -- is this the first membership period for the portfolio/company?
       if ( select count(*)
                    from dbo.portfolio_companies
                    where( portfolio_id = @portfolio_id)
                           and( company_id = @company_id) ) <= 0
             return 1

       -- check overlapped periods
       if @date_end is null              -- is this a permanent (endless) membership?
       begin
             if ( select  count(*)
                    from dbo.portfolio_companies
                    where( company_id = @company_id
                           and( portfolio_id = @portfolio_id)
                           and( id != @item_id)
                           and(( date_end is null) or (date_end >= @date_start))
                           )) > 0
                    return 0
       end
       else
       begin
             if ( select  count(*)
                    from dbo.portfolio_companies
                    where( company_id = @company_id
                           and( portfolio_id = @portfolio_id)
                           and( id != @item_id)
                           and( date_start <= @date_end)
                           and(( date_end is null) or (date_end >= @date_start))
                           )) > 0
                    return 0
       end

       return 1
END

 

We can now use the function as a check constraint condition for the membership table:

 

 

Possible extensions

It seems that, with some more effort, this could evolve into a generic Time-Span unique indexer.

Looking forward to hearing your thoughts about this :)