Which one is faster? Inner Join or Function

Sheik 2015-09-09 07:06:55

I have a table which has the Employment Data. And there are some supporting masters like Employee, Department and Location. In Employee Data I have Employee Code, Department Code and Location Code. I need a report with Employee Code, Name, Department Code, Department Desc, Location Code and Location Desc.
Is it good to write Inner join with all the 4 table or Write a Select Statement from Employment Data table with function calling from all other table
Option 1 :
Select ED.ECode, EM.EName, ED.DCode, DM.DName, ED.LCode, LM.LName
FROM Employment_Data ED,
Employee EM,
Department DM,
Location LM
AND ED.DCode = DM.DCode
AND ED.LCode = LM.LCode

Option 2 :
Select ED.ECode, dbo.Emp_Name(ED.ECode),
ED.DCode, dbo.Dep_Name(ED.DCode),
ED.LCode, dbo.Loc_Name(ED.LCode)
FROM Employment_Data ED

Which one is faster?

SQLSentrySDyckes 2015-09-09 17:08:12
Which one is faster would depend upon the dataset you are working with coupled with the indexes on the tables. The larger the dataset being analyzed, the more I would lean toward Option 1 being faster. When using a function in the SELECT, you will be processing the data row by row (RBAR), which does not scale very well.

For a specific answer to your question, you will need to test both methods against your current dataset. I would also advise creating a larger test dataset to test how each scale as the dataset grows.

I would also recommend writing your JOINs in an easier to read format specifying the INNER JOIN and the ON:

FROM Employment_Data ED
INNER JOIN Employee EM ON ED.ECode = EM.ECode
INNER JOIN Department DM ON ED.DCode = DM.DCode
INNER JOIN Location LM ON ED.LCode = LM.LCode

Sheik 2015-09-10 07:14:51
Got it clear. Thank you.
SQLkiwi 2015-09-10 19:22:06
I'm going to get off the fence and say you should always use a join for this. Functions in SQL Server do not behave like functions in other languages.

The proposed syntax in the question suggests you are thinking of implementing this using scalar functions that perform data access. This is a pattern you should avoid at all costs. SQL Server will end up running a separate query (to simplify slightly) for each function, for each row. As you might imagine, this is not a recipe for high performance. In addition, the query optimizer has essentially zero opportunity to rewrite such queries for better performance.

Using a join is the relational solution, and will allow the optimizer to apply the tricks it knows. To emphasise: you will almost never want to use scalar functions where they might be called for more than a very small number of rows, as they are currently implemented. Functions that perform data access should be avoided even more than those that do not.